Linux System Calls, Part 1: vfork, fork, clone

May 15th, 2011 by klose | Posted under Programming Notes.

The following is reposted from a blog on JavaEye: http://memorymyann.iteye.com/blog/235638 The content of that blog follows; most of it is well written.

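 fork, vfork and clone are all Linux system calls used to create a child process (strictly speaking, what vfork creates behaves like a thread under the definition below, because it runs in the parent's address space).
 First, the four things a process must have:
 a. A program for the process to run, just as a play needs a script. The program can be shared by several processes, just as several performances can use one script.
 b. A minimum of private property: a system stack reserved for the process.
 c. A "household registration", which is what operating systems call the process control block; in Linux it is implemented as task_struct.
 d. An independent memory space.
 When a process lacks condition d, we call it a thread.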
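 1. A child created by fork gets a copy of the parent's resources, including the memory contents and the contents of task_struct (the two processes have different pids). This is a copy of the resources themselves, not a copy of pointers, as the following example shows: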
 [root@liumengli program]# cat testFork.c
 #include <stdio.h>
 #include <unistd.h>   /* fork, getpid */
 int main() {
         int count = 1;
         int child;
         if(!(child = fork())) { // create the child; fork returns 0 in the child
                 printf("This is son, his count is: %d. and his pid is: %d\n", ++count, getpid()); // child branch
         } else {
                 printf("This is father, his count is: %d, his pid is: %d\n", count, getpid());
         }
         return 0;
 }
 [root@liumengli program]# gcc testFork.c -o testFork
 [root@liumengli program]# ./testFork
 This is son, his count is: 2. and his pid is: 3019
 This is father, his count is: 1, his pid is: 3018
 [root@liumengli program]#
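 The code shows that the two pids differ and that count is copied by value: the child changed count, but the parent's count did not change. Some people think that copying this much would make fork too slow. In fact, what fork copies is the parent's task_struct, system stack and page tables, which means that in the program above, before ++count runs, count in the child and count in the parent refer to the same physical memory. Only when the child writes to a variable inherited from the parent does the kernel use copy_on_write to create a new copy of the page involved. So it is only after ++count executes that the child gets a new page holding a copy of the original page's contents. Copying the basic resources is necessary, and it is efficient. Seen as a whole, it looks as if the parent's independent memory space had been copied as well.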

 Next, the child and the parent do not interfere with each other; their resources are clearly independent. Consider the following program:
 [root@liumengli program]# cat testFork.c
 #include <stdio.h>
 #include <unistd.h>   /* fork, getpid */
 int main() {
         int count = 1;
         int child;
         if(!(child = fork())) {
                 int i;
                 // the child prints many lines so that the parent's line lands somewhere in between
                 for(i = 0; i < 200; i++) {
                         printf("This is son, his count is: %d. and his pid is: %d\n", i, getpid());
                 }
         } else {
                 printf("This is father, his count is: %d, his pid is: %d\n", count, getpid());
         }
         return 0;
 }
 [root@liumengli program]# gcc testFork.c -o testFork
 [root@liumengli program]# ./testFork
 ...
 This is son, his count is: 46. and his pid is: 4092
 This is son, his count is: 47. and his pid is: 4092
 This is son, his count is: 48. and his pid is: 4092
 This is son, his count is: 49. and his pid is: 4092
 This is son, his count is: 50. and his pid is: 4092
 This is father, his count is: 1, his pid is: 4091
 [root@liumengli program]# This is son, his count is: 51. and his pid is: 4092
 This is son, his count is: 52. and his pid is: 4092
 ...
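 (Bad luck: it took about 200 iterations before the effect showed up.) The output shows that the parent and the child run concurrently, with their output interleaved. This is different from vfork, described next.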

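 2. What vfork creates is not a process in the full sense defined above but something closer to a thread, because it lacks the fourth of the four elements, independent memory (the kernel still gives it its own pid and task_struct). Look at the following program: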
 [root@liumengli program]# cat testVfork.c
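 #include <stdio.h>
 #include <unistd.h>   /* vfork, getpid, _exit */
 int main() {
         int count = 1;
         int child;
         printf("Before create son, the father's count is:%d\n", count);
         if(!(child = vfork())) {
                 // demonstration only: POSIX leaves writes to the parent's variables after vfork undefined
                 printf("This is son, his pid is: %d and the count is: %d\n", getpid(), ++count);
                 _exit(1); // _exit rather than exit: the stdio buffers are shared with the parent
         } else {
                 printf("After son, This is father, his pid is: %d and  the count is: %d, and the child is: %d\n", getpid(), count, child);
         }
         return 0;
 }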
 [root@liumengli program]# gcc testVfork.c -o testVfork
 [root@liumengli program]# ./testVfork
 Before create son, the father's count is:1
 This is son, his pid is: 4185 and the count is: 2
 After son, This is father, his pid is: 4184 and the count is: 2, and the child is: 4185
 [root@liumengli program]#
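 The output shows that the child (thread) created by vfork shares the parent's count variable. This time it is the pointer that is copied, so to speak: both sides refer to the same memory, so when the child modifies count, the parent's count is affected too. In addition, a child created by vfork causes the parent to be suspended; the parent is woken only when the child exits or calls execve. Look at the following program: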
 [root@liumengli program]# cat testVfork.c
 #include <stdio.h>
 #include <unistd.h>   /* vfork, getpid, _exit */
 int main() {
         int count = 1;
         int child;
         printf("Before create son, the father's count is:%d\n", count);
         if(!(child = vfork())) {
                 int i;
                 for(i = 0; i < 100; i++) {
                         printf("This is son, The i is: %d\n", i);
                         if(i == 70)
                                 _exit(1); // the child ends here, so ++count below never runs
                 }
                 printf("This is son, his pid is: %d and the count is: %d\n", getpid(), ++count);
                 _exit(1);
         } else {
                 // reached only after the child has exited
                 printf("After son, This is father, his pid is: %d and  the count is: %d, and the child is: %d\n", getpid(), count, child);
         }
         return 0;
 }
 [root@liumengli program]# gcc testVfork.c -o testVfork
 [root@liumengli program]# ./testVfork
 ...
 This is son, The i is: 68
 This is son, The i is: 69
 This is son, The i is: 70
 After son, This is father, his pid is: 4433 and the count is: 1, and the child is: 4434
 [root@liumengli program]#
 This shows that the parent always waits until the child has finished before it continues.

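 3. clone is powerful and takes many parameters, so processes created with it are more complex than with the previous two methods. clone lets you choose which of the parent's resources to inherit: you can share a virtual address space with the parent, as vfork does, which makes the result a thread; you can choose not to share it; you can even make the new process a sibling of its creator instead of its child. First, the function's signature: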
 int clone(int (*fn)(void *), void *child_stack, int flags, void *arg);
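 Here fn is a function pointer. Of the four elements of a process, this is the pointer to the program, the "script". child_stack is the stack for the child; because the stack grows downward on most architectures, it normally points to the top of the memory set aside for it. (Separately, the kernel keeps a system stack for every process; on older kernels such as 2.4 on x86 it is 2 pages, i.e. 8K, with the process control block task_struct stored at the low end of that block.) flags describes which resources to inherit from the parent, and arg is the argument passed to the child. The values flags can take are listed below: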
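 Flag              Meaning
    CLONE_PARENT    The new process's parent is the caller's parent, so the new process and its creator are "siblings" rather than "parent and child"
    CLONE_FS        The child shares filesystem information with the parent, including root, current directory and umask
    CLONE_FILES     The child shares the parent's file descriptor table
    CLONE_NEWNS     Start the child in a new namespace; the namespace describes the process's file hierarchy (its mount points)
    CLONE_SIGHAND   The child shares the parent's table of signal handlers
    CLONE_PTRACE    If the parent is being traced, the child is traced too
    CLONE_VFORK     The parent is suspended until the child releases its virtual memory resources
    CLONE_VM        The child runs in the same memory space as the parent
    CLONE_PID       The child is created with the same PID as the parent (obsolete; removed during the 2.5 kernel series)
    CLONE_THREAD    Added in Linux 2.4 to support the POSIX threads standard; the child shares the parent's thread group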
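 The following example creates a thread (the child shares the parent's virtual address space; without an independent address space of its own it cannot be called a process). The parent is suspended and continues only after the child thread has released the virtual memory resources.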
 [root@liumengli program]# cat test_clone.c
 #define _GNU_SOURCE   /* needed for clone() */
 #include <stdio.h>
 #include <stdlib.h>   /* malloc, free, exit */
 #include <unistd.h>   /* getpid */
 #include <sched.h>
 #include <signal.h>
 #define FIBER_STACK 8192
 int a;
 void * stack;
 int do_something(void *arg){
         printf("This is son, the pid is:%d, the a is: %d\n", getpid(), ++a);
         free(stack); // I am not sure about this: if the memory is not freed here, will it be released after the child thread dies? If you know, please tell me, thanks
         exit(1);
 }
 int main() {
         void * stack;
         a = 1;
         stack = malloc(FIBER_STACK); // allocate the stack for the child
         if(!stack) {
                 printf("The stack failed\n");
                 exit(0);
         }
         printf("creating son thread!!!\n");
         clone(&do_something, (char *)stack + FIBER_STACK, CLONE_VM|CLONE_VFORK, 0); // create the child thread
          printf("This is father, my pid is: %d, the a is: %d\n", getpid(), a);
          exit(1);
 }
 [root@liumengli program]# gcc test_clone.c -o test_clone
 [root@liumengli program]# ./test_clone
 creating son thread!!!
 This is son, the pid is:7326, the a is: 2
 This is father, my pid is: 7325, the a is: 2
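 I also left a reply on that blog about the question raised above:
 The free question. First, you may have overlooked that when stack is declared both globally and locally, the child uses the global variable stack, so the pointer it frees is not the block that was actually allocated on the heap. Checking memory usage with Valgrind:
 ==14022== HEAP SUMMARY:
 ==14022==     in use at exit: 8,192 bytes in 1 blocks
 ==14022==   total heap usage: 1 allocs, 0 frees, 8,192 bytes allocated
 One would expect that simply deleting the void * stack; declaration inside main removes the leak. Also note that with CLONE_VM the parent and the child share memory.
 To check this, use the following code: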
 #define _GNU_SOURCE   /* needed for clone() */
 #include <stdlib.h>
 #include <stdio.h>
 #include <unistd.h>   /* getpid, sleep */
 #include <sched.h>
 #include <signal.h>
 #define FIBER_STACK 8192
 int a;
 void * stack;
 int do_something(void *arg){
         printf("This is son, the pid is:%d, the a is: %d\n", getpid(), ++a);
         free(stack); // the memory does not get released here
         exit(1);
 }
 int main() {
        // void * stack;
         a = 1;
         stack = malloc(FIBER_STACK); // allocate the stack for the child
         if(!stack) {
                 printf("The stack failed\n");
                 exit(0);
         }

         printf("creating son thread!!!\n");

         clone(&do_something, (char *)stack + FIBER_STACK, CLONE_VM|CLONE_VFORK, 0); // create the child thread
          printf("This is father, my pid is: %d, the a is: %d\n", getpid(), a);
 //      free(stack);
          // keep the parent alive and keep incrementing the shared variable
          while(1) {
             sleep(1);
              printf("This is father, my pid is: %d, the a is: %d\n", getpid(), ++a);
          }
          exit(1);
 }

The run produces the following output:
==14072==
==14072== HEAP SUMMARY:
==14072==     in use at exit: 8,192 bytes in 1 blocks
==14072==   total heap usage: 1 allocs, 0 frees, 8,192 bytes allocated
==14072==
==14072== LEAK SUMMARY:
==14072==    definitely lost: 0 bytes in 0 blocks
==14072==    indirectly lost: 0 bytes in 0 blocks
==14072==      possibly lost: 0 bytes in 0 blocks
==14072==    still reachable: 8,192 bytes in 1 blocks
==14072==         suppressed: 0 bytes in 0 blocks
==14072== Rerun with --leak-check=full to see details of leaked memory
==14072==
==14072== For counts of detected and suppressed errors, rerun with: -v
==14072== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 7 from 7)
This is father, my pid is: 14071, the a is: 2
This is father, my pid is: 14071, the a is: 3
This is father, my pid is: 14071, the a is: 4
This is father, my pid is: 14071, the a is: 5
This is father, my pid is: 14071, the a is: 6
This is father, my pid is: 14071, the a is: 7
^C==14071==
==14071== HEAP SUMMARY:
==14071==     in use at exit: 8,192 bytes in 1 blocks
==14071==   total heap usage: 1 allocs, 0 frees, 8,192 bytes allocated

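After the child has run, the VM returns to the parent (CLONE_VFORK kept the parent suspended until then); as I understand it, this handover amounts to the kernel releasing a lock and then taking it again. So, judging by the output above, the memory is not released in the child; it is released only when the parent calls free. That is also the safe place to do it, because the child is still running on that very stack at the moment it calls free(stack). The way the kernel schedules parent and child and manages memory here is reasonable: whoever allocates should free. That rule always holds.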

From Binospace, post Linux System Calls, Part 1: vfork, fork, clone
