In many cases, FileMaker databases get data from outside sources: Quickbooks, Excel, SAP, etc. Based on the way those external spreadsheets or systems are setup, the data may or may not be in the format that FileMaker wants it. Here’s where we start dealing with the issue of parsing.
Parsing entails taking something that’s all stuck together and breaking it into smaller pieces. For example, we could parse out the area code from a 10-digit phone number, or we can extract the month, day, and year from a birth date. However, what we often face is not as simple and can require more effort to break apart. But have no fear – this article is here to get your brain thinking about ways to approach parsing.
Often, we use a field with an auto-enter calculation or a calculation field to store the parsed data. For example, if I had a class code like ENG101, I may want to put ENG and 101 in different fields so that I can easily search both separately. Additionally, I can write a calculation that determines that the abbreviation “ENG” stands for “English”.
Here are the steps I go through in order to figure out what my calculation will look like:
- Review multiple samples of the data to be parsed. If you only use one example, you’re bound to miss the nuances of “more complex” examples.
- Look for patterns. Ask yourself, “If I had to do this manually, what would I be looking for?” Look for things like:
- X always starts with Y
- There is always an X before/after Y (where X could be a comma, semicolon, colon, etc.)
- X will always be Y number of characters
- Don’t focus too much on extraneous stuff. What I mean is there are great tools in FileMaker to filter (hint, it’s the “Filter” function!) text down to the “wanted” characters.
- Love and respect the Let function. The beauty of the let function is that it lets you take parsing one step at a time, with minimal confusion. See the examples below for proof. Also, use “white space” or spaces/returns within your calculation so that it’s easy to re-read your calculation later, along with comments to explain what’s going on.
- Test, test, test. In most cases, the Data Viewer is the best way to parse out text so you can see your calculation working against live data.
Here is a list of common functions when parsing data:
- Left or LeftWords
- Right or RightWords
- Middle or MiddleWords
- Position
- Filter
- Substitute
- PatternCount
- Length
When parsing data, the calculation you write usually goes in one of two places: a calculation field or a text/number/date field with an auto-enter calculation. The latter allows the user to modify the result if there’s some nuance, while the former does not.
Example 1
You have a part number that has some logic built into it and you want to take that logic from a code to something a human understands.
PartNumber 1: TX-157566WH3 PartNumber 2: VA-156513WH2
The format for the part number is State – Part Serial Number & Warehouse Number.
Parse 1a – Let’s get the state first. (Note: The 2-letter state always has a dash after it. Always celebrate the easy victories!)
Let( [ pn = data::PartNumber ; dash = “-” ; pos = Position( pn ; dash ; 1 ; 1 ) ] ; // Minus 1 so that we don’t include the dash Left( pn ; pos - 1 ) )
Parse 1b – Let’s get the Part Number. (Note: The part number is between the dash and the “WH” that stands for warehouse).
Let( [ pn = Data::PartNumber ; dash = "-" ; wh = "WH" ; // Find the first occurrence of the dash pos1 = Position( pn ; dash ; 1 ; 1 ) ; // Find the first occurrence of "WH" pos2 = Position( pn ; wh ; 1 ; 1 ) ] ; /* Plus 1 to start on the character after the dash; minus 1 to compensate for starting 1 character to the right than the dash */ Middle( pn ; pos1 + 1 ; pos2 - pos1 - 1 ) )
Parse 1c – How do we get the warehouse number? We are looking for whatever is after “WH”.
Let ( [ pn = Data::PartNumber ; len = length( pn ) ; wh = "WH" ; pos = Position( pn ; wh ; 1 ; 1 ) ] ; // Minus 1 to not get the "H" in "WH" Right( pn ; len - pos - 1 ) )
Example 2
You are tracking payments in QuickBooks and at some point, the data gets pulled into FileMaker. Your accountant uses the Memo field to store important data that should go into different FileMaker fields.
Memo 1: Partial PMT for Project# 19-1A3152567 Memo 2: Full PMT for Project# 19-1A3152567, Project#79-13687AB87 Memo 3: Ptl PMT, Project 19-12562567, #79-12667DB17
Parse 2a – We need to know if it’s a partial payment or a full payment, based on the memo field. (Note: I’m looking for specific words – “partial” “full” and “ptl” so I can use PatternCount to do that.)
Let ( [ f = data::memo ; // “f” for “field” t1 = “full” ; // t1 for “text 1” t2 = “partial” ; // t2 for “txt 2” t3 = “ptl” ; // t3 for “text 3” // Find how many times “full” appears in the memo field pc1 = PatternCount( f ; t1 ) ; // Find how many times “partial” or “ptl” appears in the memo field pc2 = PatternCount( f ; t2 ) + PatternCount( f ; t3 ) ; Case( pc1 > 0 ; “Full” ; pc2 > 0 ; “Partial” ; “”) )
Parse 2b – We want to get a value list (return-delimited) of all the projects referenced in a memo. (Note: The word “Project” is the starting marker for the project numbers and it looks like they go until the end of the field, each separated by commas. The word “project” could appear in front of every project, but might not. Because the # is extraneous, I can filter or substitute that out.)
Let ( [ // "Substitute" is case sensitive so by making it all upper case, I don't have to worry about it. f = Upper (Data::memo) ; // The number of characters in the memo field len = Length( f ) ; // Our “marker” t1 = "PROJECT" ; // On what character does the first occurrence of "project" start? pos = Position( f ; t1 ; 1 ; 1 ) ; // Grab everything from "Project" to the end rtext = Right( f ; len - pos + 1 ); // Substitutes “PROJECT” to nothing, commas with a return, # to nothing. sub = Substitute( rtext ; [ "PROJECT" ; "" ] ; [ ", " ; ¶ ] ; [ "#" ; "" ] ) ] ; // The If statement checks to see if the word “Project” was found at all. If( pos > 0 ; Trim( sub ) ; "" ) )
Talk about a lot of code! The good news: once the calculation is written, your job as the developer is done. Record after record, the text will get parsed out automatically. The key lessons are learning to recognize patterns, as well as using the let function in order to “walk through” the logic, instead of it being all jumbled in one long calculation. It also makes for easier debugging. Think of it as algebra. What’s easier to debug and decipher?
Let ( [ Y = Z + 4 – 3A X = Y + 3 ] ; X + 4Y – Y/(5-Y) )
Without the “X” and “Y” meaning something, we’d get something messier like:
((Z + 4 – 3A) + 3) + 4(Z + 4 – 3A) – (Z + 4 – 3A)/(5-(Z + 4 – 3A))
And what if you have to go in and change “Y” to “Z + 3 – 4A”? In the first example, you change it one place. In the bottom example, you have to hunt for every occurrence of it.
Parsing and the Let function are all about trial and error, pattern recognition, and patience! Also, there are usually multiple ways to solve the same problem, so don’t feel like there’s only one right way.
Have you faced any parsing problems lately? How were you able to solve the issue? Comment and let me know!