Today, colleagues from the project and development teams reported a problem. Their program wrote files to the server in batches of 1,000 lines, and although the finished file was only about 9 MB in total, writing it took several minutes. The developer even added logging of the time taken per batch to troubleshoot, but couldn't find the cause.
I was puzzled by this problem, and even suspected the program itself, so I went to check. First, df -hT showed that the directory the program persisted files to was mounted on remote NFS storage. So I confidently suspected that network quality problems on the NFS mount must be causing the slow writes. I then used the dd command to write a 100 MB file to test the storage rate. Strangely, it finished in an instant. Feeling slapped in the face, I dd'ed a 4 GB file to test again; it also finished in an instant. At that point I "felt" that even though dd had returned, the data might still be trickling up to the remote NFS in the background bit by bit. Then I looked at the network monitoring and was stunned: the on-site intranet rate reached an astonishing 1 Gb/s...
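For reference, a typical large-block test along those lines looks like this (the path and sizes here are illustrative, not the exact commands from that day); adding conv=fsync makes dd flush the data to the server before reporting, so the number isn't just the client's page cache:

$ dd if=/dev/zero of=/nas/ddtest.bin bs=1M count=100 conv=fsync

With 1 MB blocks the data goes out as a stream of large NFS WRITEs, which is exactly the pattern NFS handles well.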
This was strange. Frustrated, I found a group full of experts and asked around. One of them said: "It depends on how you write it. There is a big difference between writing 10 MB once and writing 10 KB 1,000 times." Clearly I still had things to learn. Time to make a script that simulates what the development colleagues were doing and reproduce the problem 😂😂😂
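What the expert meant can be demonstrated with dd itself: oflag=dsync commits every individual write before the next one starts, so the same 10 MB takes wildly different amounts of time depending on the write size (a sketch, assuming the NFS mount at /nas):

$ # 10 MB in a single synchronous write
$ dd if=/dev/zero of=/nas/one-shot.bin bs=10M count=1 oflag=dsync
$ # The same 10 MB as 1,000 synchronous 10 KB writes
$ dd if=/dev/zero of=/nas/small-writes.bin bs=10K count=1000 oflag=dsync

Each synchronous write pays at least one network round trip, so a thousand small writes pay roughly a thousand round trips.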
Preparation
Sample data
First, prepare a 10,000-line text file full of "Hello" for testing:
$ yes "Hello" | head -n 10000 > 1w-hello.txt
$ wc -l 1w-hello.txt
10000 1w-hello.txt
Script preparation
Prepare a script for the write test; it writes the file content to the target location 1,000 lines at a time and reports progress after each batch:
$ cat speet_test.sh
#!/bin/bash
# Source file
input_file=$1
# Target file
output_file=$2

line_count=0
SECONDS=0
batch_size=1000

# Truncate the target file
> "$output_file"

# Read the input file line by line
while IFS= read -r line; do
    # Write the current line to the output file
    echo "$line" >> "$output_file"
    ((line_count++))
    # Check if the batch size is reached
    if [ "$line_count" -eq "$batch_size" ]; then
        line_count=0
        echo "-------- Batch Separator --------" >> "$output_file"
        echo "Current target file size: $(du -sh ${output_file} | awk '{print $1}') Number of lines: $(wc -l ${output_file} | awk '{print $1}') Total time taken: ${SECONDS}s"
    fi
done < "$input_file"

echo "Processing complete."
Start the test
First, write the text content to the NFS storage, one thousand lines per batch:
$ sh speet_test.sh 1w-hello.txt /nas/1w-hello-nfs.txt
# Returned results
Current target file size: 8.0K Number of lines: 1001 Total time taken: 9s
Current target file size: 12K Number of lines: 2002 Total time taken: 18s
Current target file size: 20K Number of lines: 3003 Total time taken: 26s
Current target file size: 24K Number of lines: 4004 Total time taken: 37s
Current target file size: 32K Number of lines: 5005 Total time taken: 46s
Current target file size: 36K Number of lines: 6006 Total time taken: 55s
Current target file size: 44K Number of lines: 7007 Total time taken: 64s
Current target file size: 48K Number of lines: 8008 Total time taken: 73s
Current target file size: 56K Number of lines: 9009 Total time taken: 83s
Current target file size: 60K Number of lines: 10010 Total time taken: 90s
Processing complete.
Look at this result... a file of only 10,000 lines and only 60 KB, and every batch of 1,000 lines took around 9 seconds to write...
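One way to see where the time goes (a diagnostic sketch, not part of the original troubleshooting): count the syscalls the script issues. Every echo "$line" >> "$output_file" re-opens and closes the target file, so the loop does an open/write/close cycle for every single line:

$ strace -f -c -e trace=openat,write,close sh speet_test.sh 1w-hello.txt /nas/1w-hello-nfs.txt

That close is what hurts on NFS: close-to-open consistency requires the client to flush dirty data to the server on every close, so 10,000 appended lines become 10,000 round-trip flushes. On a local filesystem the same cycle stays in memory, which is why the next test flies.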
Next, we continue the test by writing the file to local storage to compare the speed:
$ sh speet_test.sh 1w-hello.txt /tmp/1w-hello-local.txt
# Returned results
Current target file size: 8.0K Number of lines: 1001 Total time taken: 0s
Current target file size: 12K Number of lines: 2002 Total time taken: 0s
Current target file size: 20K Number of lines: 3003 Total time taken: 0s
Current target file size: 24K Number of lines: 4004 Total time taken: 0s
Current target file size: 32K Number of lines: 5005 Total time taken: 0s
Current target file size: 36K Number of lines: 6006 Total time taken: 0s
Current target file size: 44K Number of lines: 7007 Total time taken: 0s
Current target file size: 48K Number of lines: 8008 Total time taken: 0s
Current target file size: 56K Number of lines: 9009 Total time taken: 0s
Current target file size: 60K Number of lines: 10010 Total time taken: 0s
Processing complete.
Sure enough, NFS storage really is a bit tricky in this regard. I asked GPT but didn't get a good answer. After reporting back to the developer, I changed the program's persistence directory to local storage, and the improvement was night and day. 😂
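For completeness, switching to local storage isn't the only fix: the write pattern itself can be changed so the output file is opened once for the whole run instead of once per line. A minimal sketch of that variant of the script (hypothetical, not tested against the original environment):

#!/bin/bash
# Same test, but the loop's stdout is redirected once, so the output
# file is opened a single time instead of being re-opened by every echo
input_file=$1
output_file=$2
line_count=0
batch_size=1000
SECONDS=0

while IFS= read -r line; do
    echo "$line"
    ((line_count++))
    if [ "$line_count" -eq "$batch_size" ]; then
        line_count=0
        echo "-------- Batch Separator --------"
        # Progress goes to stderr so it doesn't land in the output file
        echo "Total time taken: ${SECONDS}s" >&2
    fi
done < "$input_file" > "$output_file"

echo "Processing complete." >&2

With a single open and a single close, the NFS client is free to buffer the small writes and flush them to the server in large chunks, which is essentially what dd was doing all along.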