[EDI] Handle segment compression #114
Comments
@DGollings
If you can post your spec, or email me your spec, your sample data (is it the full sample or a section of it?) and your schema, I can take a deeper look. |
Sure, I had a look around but can't find your e-mail? |
jf dot tech dot llc at gmail.com |
What you discovered is what we encountered too in the past. There are so many optional
It's nearly impossible (as far as I'm aware) to deterministically parse such
The problem isn't as trivial as what you described (i.e. stack popping). It eventually becomes a DFA or NFA matching problem, a bit like regex: imagine looking at an input file vertically, with each segment line represented by a single character, and you can see that this is really a regex pattern matching problem. As you are aware, regex matching with backtracking isn't deterministic, and in extreme cases its runtime can be exponential. So we decided to implement our current greedy algorithm, basically the
As far as we're aware, the only other comprehensive open source EDI library, https://www.smooks.org/, uses the same logic. I'm not sure how IBM/Oracle/MSFT implement theirs, but I doubt they go all the way to DFA/NFA matching.
What it means is: it's kind of hopeless, and not wise, to attempt to implement an EDI schema that is literal and verbatim to a partner spec. We chose to live with the limitation, deal with each channel individually, inspect the input constructs, and work with the partner to verify how they generate EDIs for that particular channel, which is exactly what you're doing here. |
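To make the regex analogy above concrete, here is a minimal Python sketch. It is not omniparser's or Smooks' implementation; the schema, the one-letter alphabet and both helper functions are hypothetical and only illustrate why a single greedy pass can reject input that a backtracking matcher accepts.

```python
import re

# Hypothetical schema, not taken from any real partner spec:
# (segment tag, min occurrences, max occurrences)
SCHEMA = [("NAD", 0, 1), ("NAD", 1, 1), ("NAD", 0, 1)]

# One character per segment tag, so a whole file becomes a short string.
ALPHABET = {"NAD": "N"}


def schema_to_regex(schema):
    """Turn the schema into a regex such as N{0,1}N{1,1}N{0,1}."""
    return re.compile(
        "".join("%s{%d,%d}" % (re.escape(ALPHABET[tag]), lo, hi) for tag, lo, hi in schema)
    )


def regex_match(schema, segments):
    """Backtracking match: earlier slot assignments can be revisited."""
    text = "".join(ALPHABET[tag] for tag in segments)
    return schema_to_regex(schema).fullmatch(text) is not None


def greedy_match(schema, segments):
    """Single pass: each slot takes what it can and never gives a segment back."""
    pos = 0
    for tag, lo, hi in schema:
        count = 0
        while pos < len(segments) and segments[pos] == tag and count < hi:
            pos += 1
            count += 1
        if count < lo:
            return False
    return pos == len(segments)


print(regex_match(SCHEMA, ["NAD"]))   # True
print(greedy_match(SCHEMA, ["NAD"]))  # False
```

With a single NAD in the input, the greedy pass hands it to the first (optional) slot and then finds nothing left for the mandatory one, while the backtracking matcher retries with the optional slot empty.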
@DGollings let me know if I can close the issue or there is more to discuss. |
Oh no, the only possible 'trivial' solution would be something like this
If there are four segments, don't do this:
but this
But that only works for very well-defined (and implicit) situations. I would barely know where to begin implementing this:
With the same four segments as input.
So I agree, the current greedy match is best. And a debug mode would help the user figure out the hopelessness of attempting to implement the specs as designed :) What might help anyone encountering this problem (mixed and unknown mandatory/conditional) is using a custom func:
With input being something like
This returns an object with each 'type' in its own section. |
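The grouping idea described above could look roughly like the following Python sketch. This is not the custom func from the comment; the function name, the sample segments and the qualifier codes other than LP are made up for illustration, and EDIFACT release characters (`?`) are deliberately ignored.

```python
from collections import defaultdict


def group_by_qualifier(segments):
    """Group raw segments by 'TAG+QUALIFIER', keeping first-seen order.

    Assumes '+' separates data elements and "'" terminates a segment;
    escaped separators are not handled in this sketch.
    """
    groups = defaultdict(list)
    for seg in segments:
        elements = seg.rstrip("'").split("+")
        tag = elements[0]
        qualifier = elements[1] if len(elements) > 1 else ""
        # Store the remaining data elements under the segment's 'type'.
        groups[tag + "+" + qualifier].append(elements[2:])
    return dict(groups)


# Hypothetical sample input: four NAD segments in arbitrary order.
sample = ["NAD+CZ+111::9'", "NAD+CN+222::9'", "NAD+LP+333::9'", "NAD+CN+444::9'"]
print(group_by_qualifier(sample))
```

The schema then only needs to declare a single repeating NAD segment, and the per-type lookup happens on the grouped result instead of in the segment declarations.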
Disclaimer: I only assume this is segment compression, as defined in the manual
7.1 Exclusion of segments
Conditional segments containing no data shall be omitted
(including their segment tags).
This is what I encountered in the schema, basically a mandatory/conditional sandwich.
None of the conditional segments were present in the data I was trying to parse, so I ended up fixing it using:
The message I'm trying to parse
Which basically means: grab the two explicit ones (luckily at the top), and do as you wish with the others, in whatever order you encounter them. I'm not sure how I would have handled it if I did care about NAD+LP.
I also had to use min/max 1 instead of the specified 99, as it only considers NAD, not NAD+FIRSTVALUE, when 'collapsing' similar but not identical segments.
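The difference between collapsing on the tag alone and on tag plus first value can be illustrated with a small Python sketch. It is an assumption-laden illustration, not the parser's code: the helper names and sample segments are hypothetical, and only the NAD+LP qualifier comes from the message above.

```python
from itertools import groupby


def tag(segment):
    """Key on the segment tag only, e.g. 'NAD'."""
    return segment.split("+")[0]


def tag_and_qualifier(segment):
    """Key on the tag plus the first data element, e.g. 'NAD+CN'."""
    return "+".join(segment.split("+")[:2])


def runs(segments, key):
    """Collapse consecutive segments that share the same key into (key, count)."""
    return [(k, len(list(group))) for k, group in groupby(segments, key=key)]


# Hypothetical input: four consecutive NAD segments with different qualifiers.
SEGMENTS = ["NAD+CZ+111", "NAD+CN+222", "NAD+CN+333", "NAD+LP+444"]

print(runs(SEGMENTS, tag))                # [('NAD', 4)]
print(runs(SEGMENTS, tag_and_qualifier))  # [('NAD+CZ', 1), ('NAD+CN', 2), ('NAD+LP', 1)]
```

Keyed on the tag alone, a declaration with max 99 would see one run of four and could claim all of them, which is consistent with needing min/max 1 to stop after the first segment.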
Basically, the EDI specification relies on a lot of implicit rules, which I think makes it quite hard to parse.