By Louis Pelosi

Overview

dCache is a storage system that provides transparent access to files held on disk or on magnetic tape in hierarchical storage managers (HSMs). It is a joint venture between the Deutsches Elektronen-Synchrotron (DESY), Fermi National Accelerator Laboratory (FNAL), and the Nordic e-Infrastructure Collaboration (NeIC).

dCache is capable of managing the storage and exchange of hundreds of terabytes of data, transparently distributed among dozens of disk storage nodes. One key design feature is that, although the location and multiplicity of data are determined autonomously by the system based on the configuration, CPU load, and disk space, the name-space is uniquely represented in a single file system tree. The system has been shown to significantly improve the efficiency of connected tape storage systems via caching (i.e., gather and flush) and scheduled staging techniques. Furthermore, it optimizes the throughput to and from data clients and balances the load of connected disk storage nodes by dynamically replicating files when hot spots are detected. The system tolerates failures of its data servers, allowing administrators to use commodity disk storage components. Access to the data is provided by the xrootd protocol; various FTP dialects, including GridFTP; and a native protocol (dcap) that offers regular file system operations such as open/read/write/seek/stat/close. From the users' perspective, the dCache directory (/pnfs) appears like any other cross-mounted file system. In addition, the software implements the Storage Resource Manager (SRM) protocol, which is evolving into an open standard for grid middleware to communicate with site-specific storage fabrics.[1]

[1] "Managed Data Storage and Data Access Services for Data Grids," M. Ernst, P. Fuhrmann, T. Mkrtchyan, DESY, Hamburg, Germany; J. Bakken, I. Fisk, T. Perelmutov, D. Petravick, FNAL, Batavia, IL, USA. Link: https://dcache.org/old/manuals/chep04/chep04.michael.paper.pdf.

Introduction

Efficient use of resources should be one of the major aims in (almost) any environment. In distributed computer networks, file storage space and CPU cycles are two of these resources. The file storage space typically is divided into magnetic tape drives in HSMs and data disks connected to various computers. Examples of data sets stored on the HSMs are raw data files from experiments or backup of important user data, e.g., home directories. The data disks are used to store the data files engaged in batch analyses and user data. Usually, these data files cannot be transparently accessed by the users through one uniform mechanism. Different institutions, experiments, and user groups employ their own system to copy data from tape to disk and vice versa. One critical requirement of these systems is to identify if a file is on tape or disk. Appropriate actions must be taken to move a file from one storage medium to the other.

dCache provides users with one unique name-space for all the data files. The user does not need to know where a specific file is physically located. The system is maintained centrally and eliminates work previously done by local system administrators. At the same time, dCache can be tuned to the needs of the experiments or user groups. Because dCache is a distributed system that serves a number of disks, HSMs, and users, tuning can notably improve the system's performance. dCache works as an intermediate layer between the application hosts and the data stored on disks or on tapes in HSMs.

Requests to dCache may come from command-line tools, e.g., dcap, globus-url-copy, srmcp, or from applications. In both cases, the dCache manager is contacted through an interface (dCache door). The dCache manager determines the best source or destination pool or HSM for the request and contacts this pool. Finally, the selected pool reconnects to the client.

From the user's point of view, the file selection is done by name. File storage details, such as the physical location of the file on disk or HSM, are hidden from the users. This is done using PNFS (perfectly normal file system). dCache provides a directory-like structure for the users, which they can browse like a normal directory.

At the core of the dCache system is a selection mechanism that connects the HSMs to the data disks. When a request for a certain file is issued, the system determines whether the file is already stored on one or more disks or only on an HSM, and then determines the source or destination dCache pool. For a given request, the pool selection is based on attributes of the file (storage group), the requesting host (network mask), and the dCache pools, together with a set of preferences defined by the local system administrators.[2]

[2] The dCache Book. Link: https://dcache.org/old/manuals/book.shtml.
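
The attribute-based pool selection described above could be sketched roughly as follows. This is a minimal illustration, not dCache's actual configuration format or API: the SelectionRule class, the candidate_pools function, the storage group names, and the pool names are all hypothetical, and the first-match policy is an assumption made for brevity.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class SelectionRule:
    """Hypothetical preference rule defined by a site administrator."""
    storage_group: str  # attribute of the file, e.g. "exp:raw"
    client_net: str     # network mask of the requesting host (CIDR)
    pools: tuple        # candidate pools, most preferred first


def candidate_pools(rules, storage_group, client_ip):
    """Return the pools of the first rule matching both file and client."""
    addr = ip_address(client_ip)
    for rule in rules:
        same_group = rule.storage_group == storage_group
        if same_group and addr in ip_network(rule.client_net):
            return list(rule.pools)
    return []  # no rule matched; a real system would need a fallback


# Assumed example rules: local clients get local pools, all others pool-c.
RULES = [
    SelectionRule("exp:raw", "10.1.0.0/16", ("pool-a", "pool-b")),
    SelectionRule("exp:raw", "0.0.0.0/0", ("pool-c",)),
]

print(candidate_pools(RULES, "exp:raw", "10.1.4.7"))
print(candidate_pools(RULES, "exp:raw", "192.168.0.9"))
```

Ordering the rules from most to least specific keeps the matching logic trivial; a production selector would also weigh the state of the pools themselves.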

dCache Components and Duties

I/O Doors

Clients send requests for a data file to a dCache system "door," which is a network server that performs user authentication and forwards client requests to the pool managers. There can be more than one type of door to a dCache system, each potentially handling a distinct authentication mechanism and, perhaps, residing on a separate host. Multiple instances of the same kind of door can also run on different hosts for load sharing and failover.

PnfsManager

The PNFS filesystem provides two main services for dCache. First, it serves as a mountable filesystem that presents the file repository. Second, it is used by dCache as the metadata database for the file entries.
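
To make the two roles concrete, here is a toy model of a single-tree namespace that doubles as a metadata store. It is only a sketch under stated assumptions: the Namespace and FileEntry classes, their fields, and the example path are hypothetical and do not reflect how PNFS is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    """Metadata kept per file; all field names are illustrative."""
    file_id: str
    storage_group: str
    locations: set = field(default_factory=set)  # pool names or "tape"


class Namespace:
    """Toy namespace: one logical tree mapping paths to metadata."""

    def __init__(self):
        self._entries = {}

    def create(self, path, file_id, storage_group):
        self._entries[path] = FileEntry(file_id, storage_group)

    def add_location(self, path, location):
        self._entries[path].locations.add(location)

    def lookup(self, path):
        """Metadata role: return the entry recorded for a path."""
        return self._entries[path]

    def listdir(self, directory):
        """Filesystem role: list names directly under a directory."""
        prefix = directory.rstrip("/") + "/"
        names = {p[len(prefix):].split("/", 1)[0]
                 for p in self._entries if p.startswith(prefix)}
        return sorted(names)


ns = Namespace()
ns.create("/pnfs/example.org/data/run1.raw", "0001", "exp:raw")
ns.add_location("/pnfs/example.org/data/run1.raw", "pool-a")
ns.add_location("/pnfs/example.org/data/run1.raw", "tape")
print(ns.listdir("/pnfs/example.org/data"))
print(sorted(ns.lookup("/pnfs/example.org/data/run1.raw").locations))
```

The user only ever sees the path; the set of physical locations stays inside the metadata entry.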

PoolManager

Each space request, whether for PUT or GET, is handled by the PoolManager. It performs a pre-selection of possible pools and then queries the selected pools for more information to optimize the final decision. Each pool has to register itself with the PoolManager, together with information about its affinity to certain storage classes and possibly about its topology and performance.
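
The registration and two-stage selection can be illustrated with a small sketch. Everything here is an assumption made for illustration: the PoolInfo and PoolManagerSketch names, the fields, and in particular the ranking (fewest active movers, then most free space) are hypothetical and are not dCache's real cost model.

```python
from dataclasses import dataclass


@dataclass
class PoolInfo:
    """What a pool reports when it registers (illustrative fields)."""
    name: str
    storage_classes: frozenset  # classes the pool has an affinity for
    free_bytes: int
    active_movers: int


class PoolManagerSketch:
    def __init__(self):
        self._pools = {}

    def register(self, info):
        """Called by each pool to announce itself and its affinities."""
        self._pools[info.name] = info

    def select(self, storage_class, size=0):
        """Pick a pool for a request; size is 0 for a plain GET."""
        # Stage 1: pre-select pools with an affinity for the class.
        candidates = [p for p in self._pools.values()
                      if storage_class in p.storage_classes]
        # Stage 2: use per-pool details to refine the decision. A real
        # manager would query the pools; here the record is read directly.
        usable = [p for p in candidates if p.free_bytes >= size]
        if not usable:
            return None
        best = min(usable, key=lambda p: (p.active_movers, -p.free_bytes))
        return best.name


pm = PoolManagerSketch()
pm.register(PoolInfo("pool-a", frozenset({"exp:raw"}), 500, 4))
pm.register(PoolInfo("pool-b", frozenset({"exp:raw"}), 200, 1))
pm.register(PoolInfo("pool-c", frozenset({"exp:user"}), 900, 0))
print(pm.select("exp:raw", size=100))
print(pm.select("exp:raw", size=300))
```

Splitting the decision into a cheap static filter and a more detailed second pass keeps the number of pools that need to be consulted small.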

The Pool

The pool is responsible for a contiguous disk area, and it:

  • Monitors disk space
  • Holds a list of files that are candidates for removal if disk space is running short
  • Initiates the file copy process to and from tertiary storage
  • Connects to data clients for the data transfer
  • Monitors the total bandwidth to and from the disk area and adjusts the maximum number of movers.
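
As an illustration of the first two duties in the list above, the sketch below tracks disk usage and keeps removal candidates in least-recently-used order. It is a simplified model under stated assumptions: the PoolSketch class, the "precious" flag used here to protect files from removal, and the LRU policy are chosen for this example and are not taken from the text.

```python
from collections import OrderedDict


class PoolSketch:
    """Toy pool: tracks space and evicts least recently used files."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self._files = OrderedDict()  # file_id -> size, oldest use first
        self._precious = set()       # files that must not be removed

    def free(self):
        return self.capacity - self.used

    def touch(self, file_id):
        """Record an access so the file becomes the newest entry."""
        self._files.move_to_end(file_id)

    def store(self, file_id, size, precious=False):
        """Store a new file, removing candidates only if space is short."""
        needed = size - self.free()
        victims = []
        for candidate, candidate_size in self._files.items():
            if needed <= 0:
                break
            if candidate not in self._precious:
                victims.append(candidate)
                needed -= candidate_size
        if needed > 0:
            raise OSError("not enough removable data in pool")
        for victim in victims:
            self.used -= self._files.pop(victim)
        self._files[file_id] = size
        self.used += size
        if precious:
            self._precious.add(file_id)
        return victims  # files removed to make room


pool = PoolSketch(capacity_bytes=100)
pool.store("a", 40)
pool.store("b", 40)
pool.touch("a")             # "a" was read again, so "b" is now the oldest
print(pool.store("c", 40))  # room is made by removing the oldest candidate
print(pool.free())
```

Checking that enough removable data exists before deleting anything avoids evicting files for a request that cannot be satisfied anyway.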

Process for a Simple read-with-dCache Operation

  1. The client initiates a control connection with a dCache door, identified by its hostname and port number.
  2. The client requests a file from the cache via the control connection.
  3. The door authenticates the client, then passes the file request to its pool managers in a round-robin fashion until it finds one that can deliver the file from its disk pools or determines that none of its pool managers can do so.
  4. If the file is in a disk pool, the door transfers control to that pool's manager, and a separate data connection is initiated. Then, the data file is sent to the client via the data connection.
  5. If the file is not in any pool, the door finds a pool that has space for the file and initiates a request to read that file from tape via the HSM (e.g., HPSS) into that pool. A data connection between that pool's manager and the client is then established, and the file is sent to the client.
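
The steps above can be sketched as a small simulation. This is a toy model, not dCache code: ToyPool, ToyPoolManager, and ToyDoor are hypothetical names, authentication is reduced to a user allow-list, network connections are omitted, and pool space is counted in whole-file slots for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPool:
    name: str
    free_slots: int                          # space, in whole files
    files: set = field(default_factory=set)  # paths cached on disk


@dataclass
class ToyPoolManager:
    pools: list


class ToyDoor:
    def __init__(self, managers, tape, allowed_users):
        self.managers = managers
        self.tape = tape                     # paths held by the HSM
        self.allowed_users = allowed_users

    def read(self, user, path):
        """Return (pool name, source) for a read request."""
        # Step 3: authenticate the client.
        if user not in self.allowed_users:
            raise PermissionError(user)
        # Steps 3 and 4: ask each pool manager in turn for a disk copy.
        for manager in self.managers:
            for pool in manager.pools:
                if path in pool.files:
                    return pool.name, "disk"
        # Step 5: no disk copy, so stage from tape into a pool with space.
        if path not in self.tape:
            raise FileNotFoundError(path)
        for manager in self.managers:
            for pool in manager.pools:
                if pool.free_slots > 0:
                    pool.files.add(path)
                    pool.free_slots -= 1
                    return pool.name, "staged from tape"
        raise OSError("no pool has space for " + path)


door = ToyDoor(
    managers=[ToyPoolManager([ToyPool("pool-a", free_slots=0)]),
              ToyPoolManager([ToyPool("pool-b", free_slots=1)])],
    tape={"/pnfs/example.org/data/run1.raw"},
    allowed_users={"alice"},
)
print(door.read("alice", "/pnfs/example.org/data/run1.raw"))
print(door.read("alice", "/pnfs/example.org/data/run1.raw"))
```

The second request for the same file is answered from disk, which is the caching effect the read path is designed to deliver.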

Additional Resources