[Pdx-pm] Parsing CSV Files

Andrew Clapp andrew.clapp at gmail.com
Mon Mar 2 18:48:21 PST 2020


Hey all.  One more thing: a different file for the project has
unescaped " chars in one of the fields (the one that contains raw
HTML descriptions).  Text::CSV throws this:

# CSV_XS ERROR: 2034 - EIF - Loose unescaped quote @ rec 0 pos 4 field 1

I just started working on this one, but nothing super obvious has
popped up yet.  I probably need to qw() or qq() something before the
parse.

Any ideas are appreciated.
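
For anyone hitting the same 2034 error: one option described in the
Text::CSV_XS docs is to set allow_loose_quotes and make sure escape_char
is not the same as quote_char, so stray quotes inside a field are kept
as-is.  Below is a minimal sketch, not a definitive fix.  The sample line
is made up (the real file isn't shown here), and it assumes a stray quote
is never directly followed by a comma, since that would still end the
field early.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical sample: the second field holds raw HTML with unescaped
# double quotes inside an otherwise quoted field.
my $line = '"1234","<a href="x.html">thing</a>","plain text"';

my $csv = Text::CSV->new({
    binary             => 1,      # allow arbitrary bytes inside fields
    allow_loose_quotes => 1,      # tolerate stray quote characters
    escape_char        => undef,  # must differ from quote_char for loose
                                  # quotes inside quoted fields to be kept
    auto_diag          => 1,      # report a diagnostic if parsing still fails
});

my @fields;
if ($csv->parse($line)) {
    @fields = $csv->fields;
    print "$_\n" for @fields;
}
```

Separately, "pos 4 field 1" on the very first record is also where the
opening quote would sit if the file started with a 3-byte UTF-8 BOM, so
that may be worth ruling out before loosening the parser.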


-ASC

On Sun, Mar 1, 2020 at 8:48 AM Andrew Clapp <andrew.clapp at gmail.com> wrote:
>
> Hey Richard,
>
> Yeah, I've done that when I'm splitting it up manually, and then you
> have to also discard the leading " on the first record and the
> trailing " on the last one.  I'm really liking the Text::CSV(_XS)
> route.  It's working great.  I just tell it what to do and it's doing
> it and giving me great clean data.  Laziness for the win.
>
> -ASC
>
> On Sat, Feb 29, 2020 at 2:04 PM Richard Case <caserichard at gmail.com> wrote:
> >
> > Maybe you could use "," as your delimiter (i.e., those three characters).  Sounds like fun :)
> >
> > On Sat, Feb 29, 2020, 4:20 AM Tina Müller <cpan2 at tinita.de> wrote:
> >>
> >> The standard nowadays is actually Text::CSV.
> >> It will automatically use Text::CSV_XS as the backend if available, and
> >> otherwise Text::CSV_PP.
> >>
> >> On Fri, 28 Feb 2020, Andrew Clapp wrote:
> >>
> >> > Thanks!  I'll check it out.  Weird name for a "standard".
> >> >
> >> > -ASC
> >> >
> >> > On Fri, Feb 28, 2020 at 6:30 PM Andy Lester <andy at petdance.com> wrote:
> >> >>
> >> >> Text::CSV_XS is pretty much the standard.
> >> >>
> >> >>> On Feb 28, 2020, at 8:27 PM, Andrew Clapp <andrew.clapp at gmail.com> wrote:
> >> >>>
> >> >>> Hello folks, I'm seeking advice on a module if anyone out there knows
> >> >>> of something that's already there.
> >> >>>
> >> >>> I'm looking at a few different options from cpan but I'm stuck on
> >> >>> finding a ready made solution for this one.  There's gotta be
> >> >>> something I missed.  Here's the problem in brief.
> >> >>>
> >> >>> I have to parse a CSV file, but it's double-quote wrapped, with commas
> >> >>> in the fields.
> >> >>>
> >> >>> Example with a header...
> >> >>>
> >> >>> "ID","name","desc","detailed desc"
> >> >>> "1234","thing","A nifty phrase that's easy to read","some, list, of
> >> >>> things, with commas, not so easy"
> >> >>>
> >> >>> I've tried Parse::CSV which looks promising, and tried doing it
> >> >>> myself, which works, but it's kludgey beyond useful legibility.  I
> >> >>> believe there's a good way to do this that I've not seen yet.
> >> >>>
> >> >>> Ideas?
> >> >>>
> >> >>> Thanks for looking.
> >> >>>
> >> >>> -ASC
> >> >>>
> >> >>>
> >> >>> --
> >> >>>
> >> >>> Andrew S. Clapp
> >> >>> Aeonic Enterprises
> >> >>>
> >> >>> "They're always searching for the magic bullet, and actually it's the
> >> >>> culmination of a lot of different things."  -Ken Fischer
> >> >>> _______________________________________________
> >> >>> Pdx-pm-list mailing list
> >> >>> Pdx-pm-list at pm.org
> >> >>> https://mail.pm.org/mailman/listinfo/pdx-pm-list




