So, I recently bought Quicken for the first time so that I could do line-item tax deductions. Too bad my main credit card company, Chase, artificially limits the data I can import into Quicken: only 45 days’ worth. I needed the whole of 2006 to do my taxes.
Chase does advertise that 6 years’ worth of statements are always available, but the devil is in the details. They only let you have PDFs of each month’s statement.
It’s late, so here is the short version. I learned Ruby and wrote a script to convert a folder full of these PDFs into Quicken’s XML format, QFX. Well, I actually used the free PdfTextReader to pull the text out of the PDFs first. I had started out using the Java library iText to do the PDF manipulation, but they say that their library cannot pull text out of PDFs. So it was faster to use someone else’s tool.
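For the curious, here is roughly what writing one transaction record can look like. This is a minimal sketch, not the actual script: the `stmttrn` helper and the shape of the `txn` hash are made up for illustration, the tag names are the standard OFX transaction tags as I understand them, and the `FITID` scheme (a hash of the fields) is just one assumed way to get a stable ID. A real QFX file also needs a header and wrapper elements that are not shown here.

```ruby
require 'date'
require 'digest/md5'

# Escape the characters that would break the markup.
def xml_escape(text)
  text.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
end

# Hypothetical helper: txn is a hash with :date (Date), :description (String)
# and :amount (String such as "-45.67"; negative is assumed to mean a charge).
def stmttrn(txn)
  amount = txn[:amount]
  type   = amount.start_with?('-') ? 'DEBIT' : 'CREDIT'
  posted = txn[:date].strftime('%Y%m%d')
  # Assumed ID scheme: hash the fields so re-running gives the same ID.
  fitid  = Digest::MD5.hexdigest([posted, amount, txn[:description]].join('|'))
  <<~XML
    <STMTTRN>
      <TRNTYPE>#{type}</TRNTYPE>
      <DTPOSTED>#{posted}</DTPOSTED>
      <TRNAMT>#{amount}</TRNAMT>
      <FITID>#{fitid}</FITID>
      <NAME>#{xml_escape(txn[:description])}</NAME>
    </STMTTRN>
  XML
end

puts stmttrn({ date: Date.new(2006, 1, 15), description: 'BOOKS & MORE', amount: '-45.67' })
```

Deriving the ID from the fields means that converting the same statement twice produces the same IDs, though two identical purchases on the same day would collide, so a real script would likely add a counter.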
I had to brush up a lot on my regular expressions as well. PdfTextReader pulls the text out of the PDFs differently each time, so you can’t just count spaces to find the different columns. You have to come up with a pattern that matches across thousands of transactions. In the end, I altered a few foreign transactions by hand rather than figuring out special code for them. Chase’s automatic payments broke the normal pattern too.
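To give a flavor of the regex side, here is a minimal sketch, again not the actual script. The line layout it assumes (a `MM/DD` date, then a description, then an amount, separated by any amount of whitespace) is hypothetical, and so are the `TRANSACTION` pattern and the `parse_transaction` helper. The point is only that the pattern anchors on the date at the start and the amount at the end instead of counting column positions.

```ruby
require 'date'
require 'bigdecimal'

# Assumed layout: "MM/DD  description  amount". The spacing between the
# columns can vary from line to line, so match on \s+ rather than positions.
TRANSACTION = /\A\s*(\d{2})\/(\d{2})\s+(.+?)\s+(-?\$?[\d,]+\.\d{2})\s*\z/

# Returns a hash for a transaction line, or nil for anything else
# (headers, totals, blank lines). The statement text has no year on each
# line in this assumed layout, so the caller passes it in.
def parse_transaction(line, year)
  m = TRANSACTION.match(line)
  return nil unless m

  {
    date: Date.new(year, m[1].to_i, m[2].to_i),
    description: m[3].gsub(/\s+/, ' '),      # collapse uneven spacing
    amount: BigDecimal(m[4].delete('$,'))    # drop "$" and thousands commas
  }
end

txn = parse_transaction('01/15   AMAZON.COM   SEATTLE WA      1,234.56', 2006)
puts txn[:description]
```

Lines that return `nil` are the ones worth logging, since that is where oddballs like foreign transactions and automatic payments would show up.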
Anyway, all the data back to 2005 is in Quicken now! So, I can go through and find the line items. (I found out today that my tax bill is pretty steep for some 1099 work in 2006, so hopefully this will keep my travel-the-world plans alive.)
Originally, I was going to write this program in Java and post it for the Quicken community to use. I figure there are many others who are in the same jam, and they might hit the donation button if I save them many hours of tedious labor. However, after iText revealed that it couldn’t help me with the PDFs, I changed my position to “how can Ed get this done fastest for himself?” The answer: find a PDF text-extraction tool and use a language like Ruby.
It worked out pretty well. I’m satisfied, even if it conceivably took me longer to write the program than it would have to cut and paste by hand. I’d rather spend the time gaining programming skills than getting carpal tunnel and being bored.
If anyone wants the Ruby code, just leave a comment. Maybe I’ll post a link to it tomorrow.