Both GFS and TFS (Taobao File System) use a single master node with multiple slave nodes.
My comment: does Facebook use this approach? I think it uses a P2P approach.
Differences
Functions
GFS/HDFS are more popular
On top of them, we can build table stores such as BigTable, Hypertable, and HBase.
Blob File System
Usually used for photos and albums (this kind of data is called blob data)
Challenges
Blob FS
For each write, the client asks the master node to assign a blob number and a list of machines to write to.
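A minimal sketch of this write path, assuming a hypothetical `Master` class and round-robin placement (the notes do not say how the master actually picks machines, and `write_blob`, `stores`, and the replica count of 3 are made up for illustration):

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Master:
    """Hypothetical master node: hands out blob numbers and target machines."""
    machines: list
    replicas: int = 3  # assumed replica count, not from the notes
    _next_blob: count = field(default_factory=lambda: count(1))
    _cursor: int = 0

    def assign(self):
        """Return (blob number, list of machines) for one write."""
        blob_id = next(self._next_blob)
        n = min(self.replicas, len(self.machines))
        # Round-robin placement; a real master would weigh load and free space.
        targets = [self.machines[(self._cursor + i) % len(self.machines)]
                   for i in range(n)]
        self._cursor = (self._cursor + 1) % len(self.machines)
        return blob_id, targets


def write_blob(master, stores, data):
    """Client side: ask the master first, then write to every assigned machine."""
    blob_id, targets = master.assign()
    for machine in targets:
        stores[machine][blob_id] = data  # dict stands in for a storage node
    return blob_id, targets
```

The point of the sketch is only the order of operations: the master is contacted once per write, and the data itself goes directly to the storage machines.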
Challenge
The volume of meta-data is huge.
E.g., Taobao has more than 10G (10 billion) photos. Assuming each photo has 20 bytes of meta-data, the total size is 20 bytes * 10G = 200 GB, much more than the memory of a single machine.
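The estimate can be checked with a few lines. The 64 GB of memory per machine is an assumption added here for comparison; it is not a figure from the notes.

```python
PHOTOS = 10 * 10**9            # "10G" photos
META_BYTES_PER_PHOTO = 20      # per-photo meta-data size from the estimate
MACHINE_MEMORY_GB = 64         # assumed memory of one machine (hypothetical)

total_bytes = PHOTOS * META_BYTES_PER_PHOTO
total_gb = total_bytes / 10**9

# Ceiling division: how many such machines would be needed to hold it all in memory.
machines_needed = -(-total_bytes // (MACHINE_MEMORY_GB * 10**9))

print(f"meta-data: {total_gb:.0f} GB, machines needed: {machines_needed}")
```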
Solution
The meta-data is not stored in Blob FS.
The meta-data is stored in external systems.
E.g., Taobao TFS assigns an id to each photo, and the ids are stored in external databases, such as Oracle or a sharded MySQL cluster.
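A rough sketch of keeping the photo ids in an external sharded store. The hash-modulo routing, `ShardedMetaStore`, and the `blob_ref` value are all assumptions for illustration; the notes do not describe how Taobao shards its databases.

```python
import zlib


def shard_for(photo_id: str, num_shards: int) -> int:
    """Pick a shard for an id. crc32 is stable across runs, unlike str hash()."""
    return zlib.crc32(photo_id.encode("utf-8")) % num_shards


class ShardedMetaStore:
    """Stand-in for an external sharded database; each dict plays one shard."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, photo_id: str, blob_ref):
        self.shards[shard_for(photo_id, len(self.shards))][photo_id] = blob_ref

    def get(self, photo_id: str):
        return self.shards[shard_for(photo_id, len(self.shards))].get(photo_id)
```

Because the per-photo records live outside the file system, the master no longer has to hold them in memory.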
Blob FS uses chunks to organize data.
Every blob file is a logical file.
Every chunk is a physical file.
Multiple logical files share one physical file, which reduces the number of physical files.
All meta-data for the physical files can be kept in memory, so every read of a blob file needs only one I/O access.
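The chunk layout can be sketched as a toy store in which many logical blobs are appended to one physical file and an in-memory index maps each blob to its offset and length. `ChunkStore` and the file name are hypothetical, and a real system would also handle replication, deletion, and crash recovery.

```python
import os
import tempfile


class ChunkStore:
    """Toy store: many logical blobs packed into one physical chunk file."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # blob_id -> (offset, length), held entirely in memory
        open(path, "ab").close()  # create the chunk file if it does not exist

    def put(self, blob_id, data):
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()  # the blob starts at the current end of the chunk
            f.write(data)
        self.index[blob_id] = (offset, len(data))

    def get(self, blob_id):
        offset, length = self.index[blob_id]  # memory lookup, no disk access
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)  # the only read that touches the data


if __name__ == "__main__":
    chunk_path = os.path.join(tempfile.mkdtemp(), "chunk_0001.dat")
    store = ChunkStore(chunk_path)
    store.put("photo-a", b"hello")
    store.put("photo-b", b"world!!")
    print(store.get("photo-b"), store.index)
```

Two blobs end up in a single physical file, and locating either one costs a dictionary lookup plus one read at a known offset.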
HDFS/GFS
GFS v2 may be able to combine GFS and Blob FS into one system. This is difficult because:
It needs to support both large and small files.
The meta-data is too large for a single machine, so the master nodes also need to be distributed.