Let's Read – Eloquent Ruby – Ch 5 - DEV Community


https://dev.to/baweaver/lets-read-eloquent-ruby-ch-5-28k8 · scraped

ruby


1650 words · 2026-02-14 17:41:58 UTC

Perhaps my personal favorite recommendation for learning to program Ruby like a Rubyist, Eloquent Ruby is a book I recommend frequently to this day. That said, it was released in 2011 and things have changed a bit since then.

This series will focus on reading over Eloquent Ruby, noting things that may have changed or been updated since 2011 (around Ruby 1.9.2) to today (2021, Ruby 3.0.x).

Note: This is an updated version of a previous unfinished Medium series of mine you can find here.

## Chapter 5. Find the Right String with Regular Expressions

This chapter focuses on Regular Expressions, or Regex for short. It's one of the most powerful concepts in programming around text manipulation, but also probably one of the most confusing.

If you haven't already heard of these two tools, I would highly suggest using them while exploring Regex:

- Rubular - Basic Ruby-centric Regex tester
- Regexr - More advanced, explains each segment of a Regex

Personally I tend to use Rubular more, but mostly because the extra info Regexr presents is a bit too noisy for me. In either case I would highly suggest running examples from this chapter and experimenting with them in either tool.

With that said, let's get into it.

### Introductory Examples

The chapter opens by asking why you would want Regex in the first place. The book uses the String "09:24 AM" as an example. How can you tell it's a time? Is it AM or PM? 24H or 12H? Maybe it's even ambiguous.

These questions can be difficult to answer with just String methods, and call for something a bit more powerful.

Think of Regex as a way of describing the shape of text. "09:24 AM" is composed of two digits, a colon, two digits, a space, and AM or PM. Regex is a language that lets us say exactly that:

```ruby
# Regex starts and ends with a `/`, or surrounded by %r{}
time_match = /\d{2}:\d{2} (AM|PM)/
# time_match = %r{\d{2}:\d{2} (AM|PM)}

time_match.match? "09:24 AM" # => true
```

Now that's all a bit dense to start out with, so let's step back along with the book and work through a few more examples.

### Matching One Character at a Time

The book lists a few examples, but let's turn those into code real quick:

```ruby
# The regular expression x will match x.
/x/.match? "x" # => true

# The regular expression aaa will match three a's all in a row.
/aaa/.match? "aaa" # => true

# The regular expression 123 will match the first three numbers.
/123/.match? "123" # => true

# The regular expression R2D2 will match the name of a certain sci-fi robot.
/R2D2/.match? "R2D2" # => true
/R2D2/.match? "r2d2" # => false (case sensitive)
```

### Special Characters

Now those could all have been == compares instead, so let's look at a few more interesting characters:

- `.` - Matches any one character.
- `*` - Matches zero or more of whatever comes before it.
- `+` - Matches one or more of whatever comes before it.

Going back to examples:

```ruby
# The regular expression . will match any single-character
# string including r and % and ~.
dot_match = /./
dot_match.match? "r" # => true
dot_match.match? "%" # => true
dot_match.match? "~" # => true

# In the same way, two periods ( .. ) will match any two characters,
# perhaps xx or 4F or even [!, but won't match Q since it's one,
# not two, characters long.
double_dot_match = /../
double_dot_match.match? "xx" # => true
double_dot_match.match? "4F" # => true
double_dot_match.match? "[!" # => true
double_dot_match.match? "Q"  # => false (one character)
```

### Literal Characters

Sometimes you want to match an actual dot rather than "any character", so how does one get Regex to do that? With a backslash:

```ruby
# \. will match a literal dot.
/\./.match? "." # => true

# 3\.14 will match the string version of PI to two decimal places,
# complete with the decimal point: 3.14
/3\.14/.match? "3.14" # => true

# Mr\. Olsen will match exactly one thing: Mr. Olsen
/Mr\. Olsen/.match? "Mr. Olsen" # => true
```

### Combining Effects

The book then goes into a few combinations of the pieces above.

### Sets, Ranges, and Alternatives

Say you wanted one character out of a set of them. Regex enables this with `[]`, so `[aeiou]` matches any single lowercase vowel. Think of a set as inclusion in a group of characters. The book then goes on into a few more examples here.

### Ranges

Now if that all seems a bit tedious, there's the concept of a range in Regex, so `[0-9]` stands in for `[0123456789]`.

### Common Set Shortcuts

...and the even more useful common set shortcuts, such as `\d` for a digit, `\w` for a word character, and `\s` for whitespace.

### Alternatives

The last in this section is the alternative (`|`), which you can think of as "OR", as in the `(AM|PM)` from the opening example.

The book goes on to mention that you can use as many alternatives as you would like. It also sneaks in group captures (`()`) here, which I don't believe it gets into later, but trust me when I say that's one of the most useful parts of Regex.

### The Regular Expression Star

We already met this one above in special characters: the star (`*`) stands for zero or more of whatever is before it, and the plus (`+`) stands for one or more. There's one more idea here with specifying count, but that's an item for later.

The book gives a few examples, the last of which matches with zero occurrences of the starred character. If we were to switch to `+`, that last case wouldn't work.

The book then goes on to mention that sets, ranges, and common sets work with `*` as well. Really, anything does.

The last combination the book mentions, `.*` (any number of any characters), can be extremely useful, and is frequently used to make more flexible patterns.

Personally I would advocate for being more explicit about what you expect, lest you match more than you intended. Perhaps you do want to match a lot more, and that's fine too, but make sure that's the case.

### Regular Expression Counts

The book does not mention this, but it's an important subject to bring up: counts. Star is used for zero or more, plus for one or more, and question mark for optional, but what if I wanted something like 4 to 5 instances?
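Ruby answers that question with brace quantifiers. Here is a minimal sketch of them; the constant names and sample strings are my own illustrations rather than anything from the book:

```ruby
# Brace quantifiers say how many repeats of the preceding item are allowed.
# All names and inputs here are hypothetical, chosen for illustration.
EXACTLY_FOUR = /\d{4}/   # exactly 4 digits
FOUR_OR_MORE = /\d{4,}/  # 4 or more digits
UP_TO_FIVE   = /\d{,5}/  # 0 to 5 digits
FOUR_TO_FIVE = /\d{4,5}/ # 4 to 5 digits

EXACTLY_FOUR.match?("2021")   # => true
EXACTLY_FOUR.match?("202")    # => false
FOUR_OR_MORE.match?("123456") # => true
UP_TO_FIVE.match?("")         # => true (zero digits is allowed)
FOUR_TO_FIVE.match?("12345")  # => true
FOUR_TO_FIVE.match?("123")    # => false
```

None of these patterns say anything about what may come before or after the counted digits, which is worth keeping in mind as you read on.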
There are four count forms you'll want to be aware of: exactly n (`{n}`), n or more (`{n,}`), up to m (`{,m}`), and between n and m (`{n,m}`).

Do note, though, that unless counts are combined with the anchors from the section "Beginnings and Ends" coming up, they won't work as intended. Be sure to give that section a look and see if you can spot the flaw in an unanchored count pattern.

### Regular Expressions in Ruby

Up to this point the book has only covered the Regex language without really getting into the Ruby implementation. In this article, however, I've used Ruby throughout to show how things work, so a lot of this will seem familiar. I'll give an overview of what the book mentions instead.

### Equal Squiggly (=~)

The equal squiggly sign is used for matching in Ruby, though it's not the clearest syntax. For example, `/\d\d:\d\d/ =~ "09:24 AM"` returns `0`.

Why zero? That's the position in the string where it found the match. If there were no match we'd get nil back instead.

Personally I prefer match? (added in Ruby 2.4, so after the book was written) as it returns an explicit true or false, is faster, and very rarely do I need to know the direct index of something.

### Regex Flags

The book does sneak a fast one in here with Regex flags like i, which makes a pattern case-insensitive.

There are several more you can find at the bottom of Rubular, but the common ones I use are i for case-insensitive matching and x, which ignores whitespace (and allows comments) inside the pattern.

### Methods Taking Regex

There are also methods like sub, gsub, scan, and others which take in a Regex, as in an example the book provides.

Note: in my version of that example I added the i flag, where the book omits it. Case-insensitive would be more flexible here.

### Beginnings and Ends

The book then mentions that the Regex we've used so far are unbounded, meaning they match anywhere in a string. There are a few more special expressions that allow us to specify the beginning and end of a line (`^` and `$`), and the beginning and end of a string (`\A` and `\z`).

Especially when dealing with user input you want to be exceptionally strict about this, and most Rails security tools are going to give you grief over omitting explicit beginning and ending of String signifiers in your Regex.
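To make the difference concrete, here is a small sketch comparing an unanchored pattern, a line-anchored one, and a string-anchored one. The constant names and the multi-line input are hypothetical, made up for illustration:

```ruby
# Hypothetical user input: a valid-looking first line plus an extra second line.
SNEAKY_INPUT = "12345\n<script>alert(1)</script>"

UNANCHORED      = /\d{4,5}/     # matches anywhere in the string
LINE_ANCHORED   = /^\d{4,5}$/   # ^ and $ anchor to the start and end of any line
STRING_ANCHORED = /\A\d{4,5}\z/ # \A and \z anchor to the whole string

UNANCHORED.match?("1234567")         # => true, five of the seven digits are enough
LINE_ANCHORED.match?(SNEAKY_INPUT)   # => true, the first line alone satisfies it
STRING_ANCHORED.match?(SNEAKY_INPUT) # => false
STRING_ANCHORED.match?("12345")      # => true
STRING_ANCHORED.match?("1234567")    # => false
```

Only the `\A`/`\z` version treats "4 to 5 digits" as a description of the entire string, which is usually what you mean when validating input.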
### In the Wild

The book mentions a real-world use case: timezone offsets in time.rb, notably numeric ones like -07:00 or +08:00, matched by a Regex in which the colon is followed by a question mark.

The book explains that the question mark (`?`) marks the preceding character as optional, meaning there could be a colon there, or there could not be.

Now the interesting part, and what the book wants to highlight, is that if the Regex isn't matched, the value is compared against a set of values like UTC, which mixes the usefulness of Regex with the usefulness of set inclusion.

That said, there is an alternative way to write this which I've gotten a good deal of mileage out of in the past.

### Staying Out of Trouble

The book mentions watching out for accidentally using == in place of =~. To avoid that I would still actively recommend using match? instead, as clear naming means a lot when reading your code later.

The second thing it mentions is that 0 is falsy in C-like languages but truthy in Ruby, where a 0 from =~ means something was found at the 0th index of a String.

### Wrapping Up

There's a ton to cover any time Regex comes up, and the book gives a solid start, though I do really wish it had spent a bit of time on capture groups and counts.

I may do a writeup or addendum to this chapter later on capture groups if there's interest, so let me know!

Next up we'll have Symbols, one of the more confusing aspects of Ruby.
