I want to split my text file into 1332 different files at specific points and save them in a single folder
-
I have a file containing 1332 cif molecules. I want to convert them into 1332 different cif molecule files in a single folder.
-
You need a programming language and some coding skills in order to do that.
It is not something that Notepad++ can help you with.
Sorry and good luck.
-
Hello, @abhishek-sharma, @alan-kilborn and All,
You said :
I have a file containing 1332 cif molecules. I want to convert them into 1332 different cif molecule files in a single folder.
Well, if each cif molecule corresponds to a single text line, here is a very easy solution :
- Rename your initial file as File_All.txt
- Create a new folder on your laptop
- Move to this new folder
- Copy the File_All.txt file into this folder
- Now, from this link https://code.google.com/archive/p/gnu-on-windows/downloads , download the gawk-4.1.0-bin.zip archive
- Double-click on this archive
- Extract its single file, gawk.exe
- Open a CMD prompt window
- Type in the command
  gawk "BEGIN {n=0} {n++ ; print > \"File_\"n\".txt\"}" File_All.txt
  and validate with the Enter key
- After a while, when finished, you should get 1332 files ( from File_1.txt to File_1332.txt ), each containing one line !
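For anyone who would rather not download gawk, the same one-file-per-line split can be sketched in Python. This is only a minimal illustration under the same assumption (one molecule per line); the function name split_lines_to_files and its prefix parameter are made up for this example and are not part of the original solution.

```python
from pathlib import Path


def split_lines_to_files(source: Path, out_dir: Path, prefix: str = "File_") -> int:
    """Write each line of `source` to its own numbered file in `out_dir`.

    Hypothetical helper: mirrors the gawk one-liner, where line n goes to File_n.txt.
    Returns the number of files written.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    with source.open(encoding="utf-8") as handle:
        for count, line in enumerate(handle, start=1):
            target = out_dir / f"{prefix}{count}.txt"
            # Like awk's print, always end the output with a newline
            target.write_text(line.rstrip("\n") + "\n", encoding="utf-8")
    return count
```

A call such as split_lines_to_files(Path("File_All.txt"), Path("out")) would then play the role of the gawk command, writing into an out folder instead of the current one.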
Now, @abhishek-sharma, if each cif molecule is NOT a single line, just show us your file containing the 1332 cif molecules or part of this file !
Of course, I’m going to get some flak, from @alan-kilborn, for showing something that does not concern Notepad++, ( so off-topic ), but hey, the solution is so simple !
Best Regards,
guy038
P.S. :
Anyone can easily test my solution ! For example, if a File_All.txt contains 17 lines, after execution, you’ll get 17 new files, from File_1.txt to File_17.txt
-
@guy038 said :
Of course, I’m going to get some flak, from @alan-kilborn, for showing something that does not concern Notepad++, ( so off-topic )
Here’s your flak. As promised. :-)
We can’t devolve into solving everyone’s non-N++ problem.
If we do, it becomes uninteresting for those that want to see Notepad++ problems presented and solved here.
-
@guy038 People come and ask here because they don’t know where else to ask, so please do help people like us even if it is unrelated to Notepad++ and you know the solution.
-
Please don’t answer cookie baking questions.
If it’s borderline about text editing, like the question above, then the best thing is to tell the original poster that “Notepad++ cannot do this; a tool that might be able to is XYZ, but you will have to look elsewhere to find help for using that tool, because this is a Notepad++ forum, not an XYZ forum.”
If it’s possible to answer in 1-2 sentences about the other tool, then as long as it’s presented with a caveat, “but further questions should be directed to an XYZ forum”, it will usually be tolerated. And with @guy038’s thousands of helpful Notepad++-specific answers, we just politely point out to him when he’s pushing the boundaries (like this time).
But this is a Notepad++ forum, not a general computer-help forum, and not even a text-transformation forum. Please keep questions and answers on topic.
-
I was reading through @guy038’s reply and got excited when I read “very easy solution” and so continued to read until I saw …gnu…download the gawk-4.1.0-bin.zip…
I’ll be a bit more sneaky and do this nearly all in Notepad++.
Step 1 - Do a search/replace with:
Search:
^
Replace:
>>"%XFILE%" echo.
Step 2 - Do another search/replace with:
Search:
(?-i)^(>>"%XFILE%" echo.data_(.+))
Replace:
set XFILE=data_\2.cif\r\necho Generating %XFILE%\r\n\1
Step 3 - Add one line at the top of the file with @echo off and then save the file as filename.bat where filename is something you pick.
Step 4 - Run the newly created batch file. It should generate 1332 separate files named data_something.cif where something is the data_... name.
Explanation - I made the assumption that each molecule starts with a data_... line and am using those lines to generate the file name for each molecule. The search/replace steps convert the file from .cif file format into a batch script that generates the .cif file content.
-
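The data_-based split described by @mkupper can also be sketched directly in Python, without going through a generated batch file. This is a minimal sketch under the same assumption (every molecule starts with a line beginning with data_); the function name split_cif_blocks is hypothetical, and the handling of lines before the first data_ line and of characters that are unsafe in file names are choices made for this example only.

```python
import re
from pathlib import Path


def split_cif_blocks(source: Path, out_dir: Path) -> list:
    """Split a multi-block file into one data_<name>.cif file per block.

    Hypothetical helper. Assumes each block starts with a line beginning
    with "data_" (case-sensitive, like the (?-i) flag in the regex).
    Lines before the first "data_" line are skipped.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []      # file names created so far, in order
    current = None    # open handle of the block being written
    try:
        with source.open(encoding="utf-8") as handle:
            for line in handle:
                if line.startswith("data_"):
                    if current is not None:
                        current.close()
                    # Replace characters that are not safe in a file name
                    name = re.sub(r"[^A-Za-z0-9._-]", "_", line.strip())
                    filename = f"{name}.cif"
                    # A repeated block name is appended to the existing file
                    mode = "a" if filename in written else "w"
                    if mode == "w":
                        written.append(filename)
                    current = (out_dir / filename).open(mode, encoding="utf-8")
                if current is not None:
                    current.write(line)
    finally:
        if current is not None:
            current.close()
    return written
```

Calling split_cif_blocks(Path("File_All.txt"), Path("molecules")) would return the list of file names it created, which makes it easy to check that the count is the expected 1332.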
@mkupper said:
I’ll be a bit more sneaky and do this nearly all in Notepad++.
And to me this is much more palatable of a solution.
Although it doesn’t entirely use Notepad++, it uses something external to Notepad++ that is present on every PC; specifically CMD.exe, which is used to “run the newly created batch file”.
We don’t want an extensive discussion of batch/CMD here, however.
-
Hi, @abhishek-sharma, @alan-kilborn, @peterjones, @dr-ramaanand, @mkupper and All,
@peterjones, you said :
…we just politely point out to him when he’s pushing the boundaries (like this time).
Peter, I do thank you for your phrasing !
Note that I could have said, in my first post :
- Hit the F5 key
- Paste the following command in the zone :
  cmd /c gawk "BEGIN {n=0} {n++ ; print > \"File_\"n\".txt\"}" File_All.txt
- Click on the Run button
On the other hand, I could have created a batch file, from within N++, called Split.bat :
@echo off
cmd /c gawk "BEGIN {n=0} {n++ ; print > \"File_\"n\".txt\"}" File_All.txt
and then, use the Run > Run... option and paste either cmd /c Split.bat or cmd /c $(FILE_NAME) !
Now, I’m going to tease you a little ! In these three posts, below, you’re actually using an approach similar to mine :
https://community.notepad-plus-plus.org/post/26531
https://community.notepad-plus-plus.org/post/46881
https://community.notepad-plus-plus.org/post/50945
I grant you that, in the first post, you began your post with :
This is not a coding help forum.
And, regarding the second post, I understand that you were testing if the output of a long_line_text file was correct or not on a printer device !
Now, here is a solution which only uses the cmd command within N++ and, thus, would keep its on-topic status !
- Let’s suppose this INPUT text, pasted in the File_All.txt file :
Here is a small text to test if my batch file works as expected
- Copy the File_All.txt as File_For_Each_Line.bat
- Open the File_For_Each_Line.bat file in Notepad++
- Add an empty line at the very beginning ( IMPORTANT )
- Choose the Language > B > Batch menu option
- Choose the Encoding > Convert to ANSI menu option ( IMPORTANT )
- Open the Replace dialog ( Ctrl + H )
- Untick all the box options
- FIND
  (?-s)(.+)|\A\R
- REPLACE
  (?1set /a NUM+=1 & echo \1 > file_%NUM%.txt:@echo OFF\r\nchcp 1252 > NUL\r\nset NUM=1\r\n)
- Tick the Wrap around option
- Select the Regular expression search mode
- Click once on the Replace All button
- Save the modifications of the File_For_Each_Line.bat file
Now :
- Hit the F5 key
- Paste cmd /c File_For_Each_Line.bat in the zone
- Click on the Run button
or
- Open a DOS command prompt
- Type File_For_Each_Line.bat and hit the Enter key
=> You should get, in the current directory, eight new files, from file_1.txt to file_8.txt, each containing one line of the initial File_All.txt
Voila !
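To see what the FIND / REPLACE pair above is expected to produce, here is a small Python sketch that builds the same kind of batch script from a block of text. It is only an illustration of the transformation, not a replacement for the Notepad++ steps; the function name lines_to_batch is hypothetical, and, like the regex version, it does not escape characters that are special to CMD ( such as &, |, <, > or % ).

```python
def lines_to_batch(text: str) -> str:
    """Build a batch script that writes each non-empty line of `text` to file_N.txt.

    Hypothetical helper mimicking the regex replacement: a three-line header,
    then one "set /a ... & echo ..." command per non-empty input line.
    """
    header = ["@echo OFF", "chcp 1252 > NUL", "set NUM=1"]
    body = [
        # %NUM% is expanded when CMD parses the line, i.e. before
        # set /a increments it, so the first line goes to file_1.txt
        f"set /a NUM+=1 & echo {line} > file_%NUM%.txt"
        for line in text.splitlines()
        if line  # (.+) in the regex never matches an empty line either
    ]
    return "\r\n".join(header + body) + "\r\n"
```

Printing lines_to_batch(...) for a few sample lines and comparing it with the content of File_For_Each_Line.bat after the Replace All step is a quick way to check that the replacement did what was intended.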
Best Regards,
guy038
-
-
@guy038 ,
Sure am glad I didn’t suggest a dBASE Plus solution for this problem and end up in that list. <g,d,r>