I am comparing the speed of writing to a file versus writing to a pipe. Please look at this code, which writes to a file when run with no command-line arguments and to a pipe otherwise:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <chrono>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

using namespace std;

void do_write(int fd)
{
    const char* data = "Hello world!";
    int to_write = strlen(data), total_written = 0;
    int x = 0;

    auto start = chrono::high_resolution_clock::now();
    while (x < 50000)
    {
        int written = 0;
        while (written != to_write)
        {
            written += write(fd, data + written, to_write - written);
        }
        total_written += written;
        ++x;
    }
    auto end = chrono::high_resolution_clock::now();
    auto diff = end - start;

    cout << "Total bytes written: " << total_written << " in "
         << chrono::duration<double, milli>(diff).count()
         << " milliseconds, " << endl;
}

int main(int argc, char *argv[])
{
    //
    // Write to file if we have not specified any extra argument
    //
    if (argc == 1)
    {
        {
            int fd = open("test.txt", O_WRONLY | O_TRUNC | O_CREAT, 0655);
            if (fd == -1) return -1;
            do_write(fd);
        }
        return 0;
    }

    //
    // Otherwise, write to pipe
    //
    int the_pipe[2];
    if (pipe(the_pipe) == -1) return -1;

    pid_t child = fork();
    switch (child)
    {
        case -1:
        {
            return -1;
        }
        case 0:
        {
            char buf[128];
            int bytes_read = 0, total_read = 0;
            close(the_pipe[1]);
            while (true)
            {
                if ((bytes_read = read(the_pipe[0], buf, 128)) == 0)
                    break;
                total_read += bytes_read;
            }
            cout << "Child: Total bytes read: " << total_read << endl;
            break;
        }
        default:
        {
            close(the_pipe[0]);
            do_write(the_pipe[1]);
            break;
        }
    }
    return 0;
}
Here is my output:
$ time ./LinuxFlushTest pipe
Total bytes written: 600000 in 59.6544 milliseconds,

real 0m0.064s
user 0m0.020s
sys 0m0.040s
Child: Total bytes read: 600000

$ time ./LinuxFlushTest
Total bytes written: 600000 in 154.367 milliseconds,

real 0m0.159s
user 0m0.028s
sys 0m0.132s
You can see that writing to the pipe is much faster than writing to the file (about 60 ms versus 154 ms), according to both the time output and my C++ timing.
Now, from what I know, when we call write() the data is copied to a kernel buffer, and a pdflush-style thread later flushes it from the page cache to the underlying file. I am not forcing this flush in my code, so there should be no disk-seek delay.
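For contrast, this is roughly what forcing the flush would look like. It is only a sketch to show what my test program does not do: the helper name write_and_sync is made up for illustration and does not appear in the code above. It uses the standard POSIX write() and fsync() calls.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstddef>

// Hypothetical helper (not part of the test program above).
// Writes the whole buffer, then calls fsync() so the call blocks until
// the kernel has pushed the data for this file out to the storage device.
// Returns the number of bytes written, or -1 on error.
ssize_t write_and_sync(int fd, const char* data, size_t len)
{
    size_t written = 0;
    while (written < len) {
        ssize_t n = write(fd, data + written, len - written);
        if (n == -1) return -1;          // report the error to the caller
        written += static_cast<size_t>(n);
    }
    if (fsync(fd) == -1) return -1;      // explicit flush of the page cache
    return static_cast<ssize_t>(written);
}
```

Since do_write() never calls anything like fsync(), the 154 ms in the file case should not include waiting on the disk.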
But what I don't know (and can't seem to find out; yes, I've looked at the kernel code but get lost in it, so no comments like "look at the code" please) is what happens differently when writing to a pipe. Is it not just a block of memory in the kernel somewhere that the child can read from? In that case, why is it so much faster than the basically identical process of writing to a file?
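To make my mental model of the pipe concrete, here is a minimal sketch. The helper name pipe_roundtrip is hypothetical, and it assumes the message is small (well under the pipe's capacity), since otherwise the write() would block with no reader running. It writes into a pipe and only afterwards reads the data back in the same process, which is what I mean by the data sitting in kernel memory until someone reads it.

```cpp
#include <unistd.h>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only.
// Assumes msg is small enough to fit in the pipe's kernel buffer.
std::string pipe_roundtrip(const std::string& msg)
{
    int fds[2];
    if (pipe(fds) == -1) return "";

    // write() returns even though nothing is reading yet:
    // the kernel holds the bytes until they are read.
    ssize_t n = write(fds[1], msg.data(), msg.size());
    close(fds[1]);                       // lets read() see end-of-file

    std::string out;
    char buf[128];
    ssize_t r;
    while (n > 0 && (r = read(fds[0], buf, sizeof buf)) > 0)
        out.append(buf, static_cast<size_t>(r));

    close(fds[0]);
    return out;
}
```

If this picture is right, then both paths are "copy into kernel memory and return", which is why the difference in timing surprises me.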