June 28th, 2024 at 2:47:07 PM
A package called "camelot." It basically does it in one line of code.
For 2004 to 2018, I used pypdf. It just extracted everything as text, and then I used a bunch of string splitting to format it how I wanted.
And yes. This is basically how I'm learning to program. I have a problem and I figure out how to solve it. The code may never be the cleanest or the most efficient way of doing it, but I've always been able to figure out what I need.
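For anyone curious what the pypdf route looks like, here is a rough sketch, not the actual script. It assumes the extracted text keeps columns separated by two or more spaces, which is not guaranteed for every PDF. The function names and the sample column layout are made up for illustration.

```python
import csv
import io
import re


def extract_text(pdf_path):
    """Pull the raw text out of every page with pypdf."""
    from pypdf import PdfReader  # imported here so the helpers below work without pypdf

    reader = PdfReader(pdf_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def split_rows(text, expected_cols):
    """Split each line on runs of 2+ spaces; keep only lines with the expected column count.

    Assumption: table columns are separated by at least two spaces in the extracted text.
    """
    rows = []
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) == expected_cols:
            rows.append(fields)
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()
```

Typical use would be something like `rows_to_csv(split_rows(extract_text("report.pdf"), 3))`, where the file name and column count are placeholders. Lines that do not split into the expected number of fields (titles, footers) are silently dropped, which is convenient but also hides layout changes.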
June 29th, 2024 at 5:30:39 AM
What I read about Camelot is that it is prone to errors if the formatting of the table changes in subtle ways. Camelot depends on the format of the table to figure out which text to extract. The data glitches were probably not your coding errors, but rather changes in the table format due to extraneous factors like the pandemic.
Quote: VegasEducation
A package called "camelot." It basically does it in one line of code
For 2004 to 2018, I used pypdf. It just extracted everything as text, and then I used a bunch of string splitting to format it how I wanted.
And yes. This is basically how I'm learning to program. I have a problem and I figure out how to solve it. The code may never be the cleanest or the most efficient way of doing it, but I've always been able to figure out what I need.
link to original post
My text-based approach would also fail if the column layout of the table changes. I should probably use awk to make sure the extracted table data looks right.
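The kind of sanity check I have in mind could also be done in Python instead of awk. This is only a sketch: it assumes the first CSV row is a header, and the column count and the list of numeric columns are things you would set per table. The function name is made up.

```python
import csv
import io


def find_bad_rows(csv_text, expected_cols, numeric_cols):
    """Return (line_number, reason) pairs for rows that look wrong.

    Assumptions: row 1 is a header; numeric_cols holds 0-based column indexes
    that should parse as numbers once "$" and "," are stripped.
    """
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=1):
        if len(row) != expected_cols:
            problems.append((line_no, f"expected {expected_cols} fields, got {len(row)}"))
            continue
        if line_no == 1:
            continue  # header row: only the field count is checked
        for col in numeric_cols:
            value = row[col].replace(",", "").replace("$", "")
            try:
                float(value)
            except ValueError:
                problems.append((line_no, f"column {col} is not numeric: {row[col]!r}"))
    return problems
```

An empty result means every row had the right number of fields and the numeric columns parsed. It would not catch values that shifted into the wrong column but still look like numbers.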
I installed Camelot and will play around with it. I have to download many of my W-2G forms in PDF format and I need a better way of turning them into CSV files.
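A first pass with Camelot might look like the sketch below. `camelot.read_pdf`, `parsing_report`, and `to_csv` are part of Camelot; the function names, the output naming scheme, and the 90% accuracy cutoff are my own assumptions, and I have not tried this on W-2G forms yet.

```python
def report_is_trustworthy(report, min_accuracy=90.0):
    """Decide whether a Camelot parsing report looks good enough to keep the table.

    The 90.0 default is an arbitrary cutoff, not a Camelot recommendation.
    """
    return report.get("accuracy", 0.0) >= min_accuracy


def export_tables(pdf_path, out_prefix, pages="1", min_accuracy=90.0):
    """Write each table Camelot finds to its own CSV, skipping low-accuracy ones."""
    import camelot  # imported here so the helper above works without Camelot installed

    tables = camelot.read_pdf(pdf_path, pages=pages)
    written = []
    for i, table in enumerate(tables):
        if not report_is_trustworthy(table.parsing_report, min_accuracy):
            print(f"skipping table {i}: {table.parsing_report}")
            continue
        out_path = f"{out_prefix}_{i}.csv"  # hypothetical naming scheme
        table.to_csv(out_path)
        written.append(out_path)
    return written
```

Skipping low-accuracy tables instead of writing them is a deliberate choice here: a missing file is easier to notice than a CSV with quietly scrambled columns.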
Last edited by: Mental on Jun 29, 2024
Gambling is a math contest where the score is tracked in dollars. Try not to get a negative score.