Hello there fellows, I made this post to illustrate some problems that first-time GSplit users may encounter. It took me a while to figure out how to get the desired results.
First of all, keep in mind that GSplit does its magic on a byte-by-byte basis, so get a hex editor to do your tests; any free one will do.
I needed to split SQL Server Management Studio ‘generate scripts’ files, and the first issue I found was that splitting a file after the nth occurrence of 0x0D0x0A didn’t work at all. That is because UTF-16 and UCS-2 LE BOM use 2 bytes for each character displayed in your text editor, such as Notepad++. In little-endian order the low byte comes first, so CR is stored as 0x0D 0x00 and LF as 0x0A 0x00. If you want it to split correctly after each CR/LF, you will need to set the search pattern to: “0x0D0x000x0A0x00”
[Type and Size][Blocked Pieces][I want to split after the nth occurrence of a specified pattern]
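If you would rather not work the bytes out by hand, here is a minimal Python sketch that builds the same pattern. The helper name to_gsplit_hex is my own invention and has nothing to do with GSplit itself. It only encodes the text and writes each byte in the 0x notation used above.

```python
def to_gsplit_hex(text: str, encoding: str = "utf-16-le") -> str:
    """Encode text and write every resulting byte as 0xNN, with no separators."""
    return "".join(f"0x{b:02X}" for b in text.encode(encoding))

# CR/LF in a 2-byte little-endian encoding: each character gets a trailing 0x00.
crlf_pattern = to_gsplit_hex("\r\n")
print(crlf_pattern)  # 0x0D0x000x0A0x00

# The same text in a 1-byte encoding gives the pattern most posts assume.
print(to_gsplit_hex("\r\n", "ascii"))  # 0x0D0x0A
```

The second call shows why the usual 0x0D0x0A pattern never matches in a 2-byte file: the 0x00 bytes sit between the two values.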
Inserting custom headers:
Again, remember the byte-by-byte processing? Well, if you’re working with a 2-byte encoded file, you will need to translate the text you want to insert into hex code; that hex editor you downloaded will help you here. For example, I will add "My custom header.[CR][LF]"
What I did was create a new text file with the same encoding, write that text down and open it with the hex editor, then copy the hex dump and paste it into Notepad++.
Here is what it looks like:
Since it was just a hex dump and GSplit uses 0x to denote hex, you can use a RegEx search and replace as follows:
search for “(..)” (any two contiguous characters) and replace with “0x\1” (the quotes should be ignored).
And you end up with:
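For anyone who wants to check the replacement outside Notepad++, the same RegEx can be tried in Python. This is only a sketch: the short dump below is the hex for "My" in a 2-byte little-endian encoding, and I am assuming the dump was pasted without spaces or other separators.

```python
import re

# Hex dump of "My" in UTF-16 LE, written without separators (an assumption;
# strip spaces first if your hex editor copies them along with the bytes).
hex_dump = "4D007900"

# Same search and replace as above: every pair of characters gets a 0x prefix.
gsplit_text = re.sub(r"(..)", r"0x\1", hex_dump)
print(gsplit_text)  # 0x4D0x000x790x00
```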
And just one more thing: if you’re using a Byte Order Mark (BOM), you will need to insert that first. In my case it was UCS-2 LE BOM, so I needed to add 0xff0xfe to the very beginning of it.
[Other properties][Tags and headers][Do not add GSplit tags to pieces (checked)][Insert additional header to pieces(checked)][Insert the following line (special characters allowed)]
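The whole header, BOM included, can also be generated in one go. The sketch below assumes a UTF-16 LE file like mine; build_header is a made-up name, and the output is simply text to paste into the header field. I write the hex digits in upper case here and have not checked whether GSplit cares about case.

```python
import codecs

def build_header(text: str) -> str:
    """Return BOM + UTF-16 LE bytes of text, each byte written as 0xNN."""
    raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    return "".join(f"0x{b:02X}" for b in raw)

header = build_header("My custom header.\r\n")
print(header[:24])   # 0xFF0xFE0x4D0x000x790x00
print(header[-16:])  # 0x0D0x000x0A0x00
```

The first two values are the BOM, followed by "M" and "y" with their 0x00 partners, and the string ends with the 2-byte CR/LF.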
I hope this post helps future GSplit users, because I believe it’s a very powerful tool. However, most posts I’ve seen assume that you’re working on a 1-byte encoded file, so these are my two cents.