ruby - How to extract data from text columns
I have 2 addresses side by side in a multi-line string:
    adresse de prise en charge :                          adresse d'arrivée :
      rue des capucines                                     rue des tilleuls
      92210 saint cloud                                     67000 strasbourg
      tél.:                                                 tél.:
I need to extract the addresses on the left and on the right with a regexp, and assign them to variables. I need to match:
    address1: "rue des capucines 92210 saint cloud"
    address2: "rue des tilleuls 67000 strasbourg"
I thought of separating them by spaces, but I can't find a regexp to count spaces. I tried:
en\s*charge\s*:\s*((.|\n)*)\s*
and similar, but that gives me both addresses together, which is not what I'm looking for. Any help would be appreciated.
Assuming that each address section in each line is indented as far as or further than the corresponding "adresse" in the first line, the following can extract not only 2 addresses aligned side by side, but n addresses in general.
    lines = string.split(/#{$/}+/)
    # => [
    # =>   "adresse de prise en charge :                          adresse d'arrivée :",
    # =>   "  rue des capucines                                     rue des tilleuls",
    # =>   "  92210 saint cloud                                     67000 strasbourg",
    # =>   "  tél.:                                                 tél.:"
    # => ]

    # Record the column at which each "adresse" header starts.
    break_points = []
    lines.first.scan(/\badresse\b/){break_points.push($~.begin(0))}

    # Turn the start columns into ranges; the trailing 0 makes the last range end at -1.
    ranges = break_points.push(0).each_cons(2).map{|s, e| s..(e - 1)}
    # => [0..53, 54..-1]

    # Slice every body line (skipping the header and the "tél." line) by column,
    # then join the pieces of each column.
    address1, address2 = lines[1..-2]
      .map{|s| ranges.map{|r| s[r]}}
      .transpose
      .map{|a| a.join(" ").strip.squeeze(" ")}
    # => [
    # =>   "rue des capucines 92210 saint cloud",
    # =>   "rue des tilleuls 67000 strasbourg"
    # => ]
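For reuse, the same idea can be wrapped in a method. This is only a sketch under the answer's assumptions: every column header on the first line contains the word "adresse", and the last line is the "tél." line to be dropped. The method name `split_columns`, its `header` parameter, and the sample text built with `ljust(40)` are made up for illustration and are not part of the original post.

```ruby
# Hypothetical helper: splits side-by-side text columns into one string per column.
def split_columns(text, header: "adresse")
  lines = text.split(/\n+/)

  # Character offset at which each header word starts on the first line.
  starts = []
  lines.first.scan(/\b#{Regexp.escape(header)}\b/) { starts << Regexp.last_match.begin(0) }

  # Each column runs from its own start up to the next start;
  # the last column runs to the end of the line.
  ranges = (starts + [nil]).each_cons(2).map { |s, e| e ? (s...e) : (s..-1) }

  # Skip the header line and the final "tél." line, slice the rest by column,
  # then gather the pieces of each column into a single string.
  lines[1..-2]
    .map { |line| ranges.map { |r| line[r].to_s } }
    .transpose
    .map { |parts| parts.join(" ").strip.squeeze(" ") }
end

# Sample input (an assumption): the left column is padded to 40 characters
# so that both columns line up.
rows = [
  ["adresse de prise en charge :", "adresse d'arrivée :"],
  ["  rue des capucines",          "  rue des tilleuls"],
  ["  92210 saint cloud",          "  67000 strasbourg"],
  ["  tél.:",                      "  tél.:"]
]
sample = rows.map { |left, right| left.ljust(40) + right }.join("\n")

address1, address2 = split_columns(sample)
puts address1  # => rue des capucines 92210 saint cloud
puts address2  # => rue des tilleuls 67000 strasbourg
```

Slicing by column offset rather than by counting runs of spaces means an address part that itself contains several spaces still stays in its own column, as long as the indentation assumption holds.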