/**
 * @defgroup FileBackend File backend
 *
 * File backend is used to interact with file storage systems,
 * such as the local file system, NFS, or cloud storage systems.
 */

/**
 * @ingroup FileBackend
 * @author Aaron Schulz
 */

/**
 * @brief Base class for all file backend classes (including multi-write backends).
 *
 * This class declares as abstract the methods that subclasses must implement.
 * Outside callers can assume that all backends will have these functions.
 *
 * All "storage paths" are of the format "mwstore://<backend>/<container>/<path>".
 * The <path> portion is a relative path that uses UNIX file system (FS) notation,
 * though any particular backend may not actually be using a local filesystem.
 * Therefore, the relative paths are only virtual.
 *
 * Backend contents are stored under wiki-specific container names by default.
 * For legacy reasons, this has no effect for the FS backend class, and per-wiki
 * segregation must be done by setting the container paths appropriately.
 *
 * FS-based backends are somewhat more restrictive due to the existence of real
 * directory files; a regular file cannot have the same name as a directory. Other
 * backends with virtual directories may not have this limitation. Callers should
 * store files in such a way that no files and directories are under the same path.
 *
 * Methods should avoid throwing exceptions at all costs.
 * As a corollary, external dependencies should be kept to a minimum.
 *
 * @ingroup FileBackend
 */
abstract class FileBackend {
	protected $name; // string; unique backend name
	protected $wikiId; // string; unique wiki name
	protected $readOnly; // string; read-only explanation message
	/** @var LockManager */
	protected $lockManager;
	/** @var FileJournal */
	protected $fileJournal;

	/**
	 * Create a new backend instance from configuration.
	 * This should only be called from within FileBackendGroup.
	 *
	 * $config includes:
	 *     'name'        : The unique name of this backend.
	 *                     This should consist of alphanumeric, '-', and '_' characters.
	 *                     This name should not be changed after use.
	 *     'wikiId'      : Prefix to container names that is unique to this wiki.
	 *                     It should only consist of alphanumeric, '-', and '_' characters.
	 *     'lockManager' : Registered name of a file lock manager to use
	 *                     (a LockManager instance is also accepted).
	 *     'fileJournal' : File journal configuration; see FileJournal::factory().
	 *                     Journals simply log changes to files stored in the backend.
	 *     'readOnly'    : Write operations are disallowed if this is a non-empty string.
	 *                     It should be an explanation for the backend being read-only.
	 *
	 * @param $config Array
	 */
	public function __construct( array $config ) {
		$this->name = $config['name'];
		if ( !preg_match( '!^[a-zA-Z0-9-_]{1,255}$!', $this->name ) ) {
			throw new MWException( "Backend name `{$this->name}` is invalid." );
		}
		$this->wikiId = isset( $config['wikiId'] )
			? $config['wikiId']
			: wfWikiID(); // e.g. "my_wiki-en_"
		$this->lockManager = ( $config['lockManager'] instanceof LockManager )
			? $config['lockManager']
			: LockManagerGroup::singleton()->get( $config['lockManager'] );
		$this->fileJournal = isset( $config['fileJournal'] )
			? FileJournal::factory( $config['fileJournal'], $this->name )
			: FileJournal::factory( array( 'class' => 'NullFileJournal' ), $this->name );
		$this->readOnly = isset( $config['readOnly'] )
			? (string)$config['readOnly']
			: '';
	}
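
	/*
Illustration only, not part of the class: the constructor rejects backend
names that do not match its pattern. The sketch below reuses that pattern in
a standalone function so the rule can be tried in isolation. The function
name isValidBackendName() is hypothetical.

```php
<?php
// Hypothetical helper; mirrors the name check done in the constructor.
function isValidBackendName( $name ) {
	// 1 to 255 characters, each a letter, a digit, '-' or '_'
	return preg_match( '!^[a-zA-Z0-9-_]{1,255}$!', $name ) === 1;
}

var_dump( isValidBackendName( 'local-backend_1' ) ); // bool(true)
var_dump( isValidBackendName( 'bad/name' ) );        // bool(false)
var_dump( isValidBackendName( '' ) );                // bool(false)
```

In the class itself a failed check throws MWException instead of returning false.
	*/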

	/**
	 * Get the unique backend name.
	 * We may have multiple different backends of the same type.
	 * For example, we can have two Swift backends using different proxies.
	 *
	 * @return string
	 */
	final public function getName() {
		return $this->name;
	}

	/**
	 * Check if this backend is read-only
	 *
	 * @return bool
	 */
	final public function isReadOnly() {
		return ( $this->readOnly != '' );
	}

	/**
	 * Get an explanatory message if this backend is read-only
	 *
	 * @return string|bool Returns false if the backend is not read-only
	 */
	final public function getReadOnlyReason() {
		return ( $this->readOnly != '' ) ? $this->readOnly : false;
	}

	/**
	 * This is the main entry point into the backend for write operations.
	 * Callers supply an ordered list of operations to perform as a transaction.
	 * Files will be locked, the stat cache cleared, and then the operations attempted.
	 * If any serious errors occur, all attempted operations will be rolled back.
	 *
	 * $ops is an array of arrays. The outer array holds a list of operations.
	 * Each inner array is a set of key value pairs that specify an operation.
	 *
	 * Supported operations and their parameters:
	 * a) Create a new file in storage with the contents of a string
	 *     array(
	 *         'op'                  => 'create',
	 *         'dst'                 => <storage path>,
	 *         'content'             => <string of new file contents>,
	 *         'overwrite'           => <boolean>,
	 *         'overwriteSame'       => <boolean>
	 *     )
	 * b) Copy a file system file into storage
	 *     array(
	 *         'op'                  => 'store',
	 *         'src'                 => <file system path>,
	 *         'dst'                 => <storage path>,
	 *         'overwrite'           => <boolean>,
	 *         'overwriteSame'       => <boolean>
	 *     )
	 * c) Copy a file within storage
	 *     array(
	 *         'op'                  => 'copy',
	 *         'src'                 => <storage path>,
	 *         'dst'                 => <storage path>,
	 *         'overwrite'           => <boolean>,
	 *         'overwriteSame'       => <boolean>
	 *     )
	 * d) Move a file within storage
	 *     array(
	 *         'op'                  => 'move',
	 *         'src'                 => <storage path>,
	 *         'dst'                 => <storage path>,
	 *         'overwrite'           => <boolean>,
	 *         'overwriteSame'       => <boolean>
	 *     )
	 * e) Delete a file within storage
	 *     array(
	 *         'op'                  => 'delete',
	 *         'src'                 => <storage path>,
	 *         'ignoreMissingSource' => <boolean>
	 *     )
	 * f) Do nothing (no-op)
	 *     array(
	 *         'op'                  => 'null',
	 *     )
	 *
	 * Boolean flags for operations (operation-specific):
	 * 'ignoreMissingSource' : The operation will simply succeed and do
	 *                         nothing if the source file does not exist.
	 * 'overwrite'           : Any destination file will be overwritten.
	 * 'overwriteSame'       : An error will not be given if a file already
	 *                         exists at the destination that has the same
	 *                         contents as the new contents to be written there.
	 *
	 * $opts is an associative array of boolean flags, including:
	 * 'force'               : Operation precondition errors no longer trigger an abort.
	 *                         Any remaining operations are still attempted. Unexpected
	 *                         failures may still cause remaining operations to be aborted.
	 * 'nonLocking'          : No locks are acquired for the operations.
	 *                         This can increase performance for non-critical writes.
	 *                         This has no effect unless the 'force' flag is set.
	 * 'allowStale'          : Don't require the latest available data.
	 *                         This can increase performance for non-critical writes.
	 *                         This has no effect unless the 'force' flag is set.
	 * 'nonJournaled'        : Don't log this operation batch in the file journal.
	 *                         This limits the ability of recovery scripts.
	 *
	 * Remarks on locking:
	 * File system paths given to operations should refer to files that are
	 * already locked or otherwise safe from modification from other processes.
	 * Normally these files will be new temp files, which should be adequate.
	 *
	 * Return value:
	 * This returns a Status, which contains all warnings and fatals that occurred
	 * during the operation. The 'failCount', 'successCount', and 'success' members
	 * will reflect each operation attempted. The status will be "OK" unless:
	 *     a) unexpected operation errors occurred (network partitions, disk full...)
	 *     b) significant operation errors occurred and 'force' was not set
	 *
	 * @param $ops Array List of operations to execute in order
	 * @param $opts Array Batch operation options
	 * @return Status
	 */
	final public function doOperations( array $ops, array $opts = array() ) {
		if ( $this->isReadOnly() ) {
			return Status::newFatal( 'backend-fail-readonly', $this->name, $this->readOnly );
		}
		if ( empty( $opts['force'] ) ) { // sanity
			unset( $opts['nonLocking'] );
			unset( $opts['allowStale'] );
		}
		return $this->doOperationsInternal( $ops, $opts );
	}

	/**
	 * @see FileBackend::doOperations()
	 */
	abstract protected function doOperationsInternal( array $ops, array $opts );
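
	/*
Illustration only, not part of the class: a sketch of what a caller might
pass to doOperations() and how the 'force' precondition affects the options.
sanitizeBatchOpts() is a hypothetical stand-in for the check at the top of
doOperations(), and the backend name and paths are made up.

```php
<?php
// Hypothetical stand-in for the option check in doOperations().
function sanitizeBatchOpts( array $opts ) {
	if ( empty( $opts['force'] ) ) {
		// Without 'force', these two flags are dropped and have no effect
		unset( $opts['nonLocking'] );
		unset( $opts['allowStale'] );
	}
	return $opts;
}

// An ordered batch: store a temp file, then delete an old file
$ops = array(
	array(
		'op'        => 'store',
		'src'       => '/tmp/upload.png',
		'dst'       => 'mwstore://local-backend/public/a/ab/New.png',
		'overwrite' => true
	),
	array(
		'op'                  => 'delete',
		'src'                 => 'mwstore://local-backend/public/a/ab/Old.png',
		'ignoreMissingSource' => true
	)
);

$plain = sanitizeBatchOpts( array( 'nonLocking' => true ) );
var_dump( isset( $plain['nonLocking'] ) );  // bool(false)

$forced = sanitizeBatchOpts( array( 'force' => true, 'nonLocking' => true ) );
var_dump( isset( $forced['nonLocking'] ) ); // bool(true)
```

With a real backend instance the batch would be run as
$backend->doOperations( $ops, $opts ) and the returned Status inspected.
	*/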

	/**
	 * Same as doOperations() except it takes a single operation.
	 * If you are doing a batch of operations that should either
	 * all succeed or all fail, then use that function instead.
	 *
	 * @see FileBackend::doOperations()
	 *
	 * @param $op Array Operation
	 * @param $opts Array Operation options
	 */
	final public function doOperation( array $op, array $opts = array() ) {
		return $this->doOperations( array( $op ), $opts );
	}

	/**
	 * Performs a single create operation.
	 * This sets $params['op'] to 'create' and passes it to doOperation().
	 *
	 * @see FileBackend::doOperation()
	 *
	 * @param $params Array Operation parameters
	 * @param $opts Array Operation options
	 */
	final public function create( array $params, array $opts = array() ) {
		$params['op'] = 'create';
		return $this->doOperation( $params, $opts );
	}

	/**
	 * Performs a single store operation.
	 * This sets $params['op'] to 'store' and passes it to doOperation().
	 *
	 * @see FileBackend::doOperation()
	 *
	 * @param $params Array Operation parameters
	 * @param $opts Array Operation options
	 */
	final public function store( array $params, array $opts = array() ) {
		$params['op'] = 'store';
		return $this->doOperation( $params, $opts );
	}

	/**
	 * Performs a single copy operation.
	 * This sets $params['op'] to 'copy' and passes it to doOperation().
	 *
	 * @see FileBackend::doOperation()
	 *
	 * @param $params Array Operation parameters
	 * @param $opts Array Operation options
	 */
	final public function copy( array $params, array $opts = array() ) {
		$params['op'] = 'copy';
		return $this->doOperation( $params, $opts );
	}

	/**
	 * Performs a single move operation.
	 * This sets $params['op'] to 'move' and passes it to doOperation().
	 *
	 * @see FileBackend::doOperation()
	 *
	 * @param $params Array Operation parameters
	 * @param $opts Array Operation options
	 */
	final public function move( array $params, array $opts = array() ) {
		$params['op'] = 'move';
		return $this->doOperation( $params, $opts );
	}

	/**
	 * Performs a single delete operation.
	 * This sets $params['op'] to 'delete' and passes it to doOperation().
	 *
	 * @see FileBackend::doOperation()
	 *
	 * @param $params Array Operation parameters
	 * @param $opts Array Operation options
	 */
	final public function delete( array $params, array $opts = array() ) {
		$params['op'] = 'delete';
		return $this->doOperation( $params, $opts );
	}

	/**
	 * Concatenate a list of storage files into a single file system file.
	 * The target path should refer to a file that is already locked or
	 * otherwise safe from modification from other processes. Normally,
	 * the file will be a new temp file, which should be adequate.
	 * $params include:
	 *     srcs         : ordered source storage paths (e.g. chunk1, chunk2, ...)
	 *     dst          : file system path to 0-byte temp file
	 *
	 * @param $params Array Operation parameters
	 */
	abstract public function concatenate( array $params );

	/**
	 * Prepare a storage directory for usage.
	 * This will create any required containers and parent directories.
	 * Backends using key/value stores only need to create the container.
	 *
	 * $params include:
	 *     dir : storage directory
	 *
	 * @param $params Array
	 * @return Status
	 */
	final public function prepare( array $params ) {
		if ( $this->isReadOnly() ) {
			return Status::newFatal( 'backend-fail-readonly', $this->name, $this->readOnly );
		}
		return $this->doPrepare( $params );
	}

	/**
	 * @see FileBackend::prepare()
	 */
	abstract protected function doPrepare( array $params );

	/**
	 * Take measures to block web access to a storage directory and
	 * the container it belongs to. FS backends might add .htaccess
	 * files whereas key/value store backends might restrict container
	 * access to the auth user that represents end-users in web requests.
	 * This is not guaranteed to actually do anything.
	 *
	 * $params include:
	 *     dir       : storage directory
	 *     noAccess  : try to deny file access
	 *     noListing : try to deny file listing
	 *
	 * @param $params Array
	 * @return Status
	 */
	final public function secure( array $params ) {
		if ( $this->isReadOnly() ) {
			return Status::newFatal( 'backend-fail-readonly', $this->name, $this->readOnly );
		}
		$status = $this->doPrepare( $params ); // dir must exist to restrict it
		if ( $status->isOK() ) {
			$status->merge( $this->doSecure( $params ) );
		}
		return $status;
	}

	/**
	 * @see FileBackend::secure()
	 */
	abstract protected function doSecure( array $params );

	/**
	 * Delete a storage directory if it is empty.
	 * Backends using key/value stores may do nothing unless the directory
	 * is that of an empty container, in which case it should be deleted.
	 *
	 * $params include:
	 *     dir       : storage directory
	 *     recursive : recursively delete empty subdirectories first (@since 1.20)
	 *
	 * @param $params Array
	 * @return Status
	 */
	final public function clean( array $params ) {
		if ( $this->isReadOnly() ) {
			return Status::newFatal( 'backend-fail-readonly', $this->name, $this->readOnly );
		}
		return $this->doClean( $params );
	}

	/**
	 * @see FileBackend::clean()
	 */
	abstract protected function doClean( array $params );

	/**
	 * Check if a file exists at a storage path in the backend.
	 * This returns false if only a directory exists at the path.
	 *
	 * $params include:
	 *     src    : source storage path
	 *     latest : use the latest available data
	 *
	 * @param $params Array
	 * @return bool|null Returns null on failure
	 */
	abstract public function fileExists( array $params );

	/**
	 * Get the last-modified timestamp of the file at a storage path.
	 *
	 * $params include:
	 *     src    : source storage path
	 *     latest : use the latest available data
	 *
	 * @param $params Array
	 * @return string|bool TS_MW timestamp or false on failure
	 */
	abstract public function getFileTimestamp( array $params );

	/**
	 * Get the contents of a file at a storage path in the backend.
	 * This should be avoided for potentially large files.
	 *
	 * $params include:
	 *     src    : source storage path
	 *     latest : use the latest available data
	 *
	 * @param $params Array
	 * @return string|bool Returns false on failure
	 */
	abstract public function getFileContents( array $params );

	/**
	 * Get the size (bytes) of a file at a storage path in the backend.
	 *
	 * $params include:
	 *     src    : source storage path
	 *     latest : use the latest available data
	 *
	 * @param $params Array
	 * @return integer|bool Returns false on failure
	 */
	abstract public function getFileSize( array $params );

	/**
	 * Get quick information about a file at a storage path in the backend.
	 * If the file does not exist, then this returns false.
	 * Otherwise, the result is an associative array that includes:
	 *     mtime  : the last-modified timestamp (TS_MW)
	 *     size   : the file size (bytes)
	 * Additional values may be included for internal use only.
	 *
	 * $params include:
	 *     src    : source storage path
	 *     latest : use the latest available data
	 *
	 * @param $params Array
	 * @return Array|bool|null Returns null on failure
	 */
	abstract public function getFileStat( array $params );
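
	/*
Illustration only, not part of the class: a sketch of the three-way result
documented for getFileStat() (array, false, or null), using a plain local
file rather than a storage path. sketchFileStat() is a hypothetical name, and
TS_MW is taken here to be the 14-digit UTC form YYYYMMDDHHMMSS.

```php
<?php
// Hypothetical FS-only sketch of the getFileStat() result shape.
function sketchFileStat( $fsPath ) {
	clearstatcache( true, $fsPath );
	if ( !is_file( $fsPath ) ) {
		return false; // no file at this path (a directory does not count)
	}
	$mtime = filemtime( $fsPath );
	$size = filesize( $fsPath );
	if ( $mtime === false || $size === false ) {
		return null; // the file exists but could not be inspected
	}
	return array(
		'mtime' => gmdate( 'YmdHis', $mtime ), // 14-digit UTC timestamp
		'size'  => $size                        // bytes
	);
}

$tmp = tempnam( sys_get_temp_dir(), 'stat' );
file_put_contents( $tmp, 'hello' );
$stat = sketchFileStat( $tmp );
unlink( $tmp );
$missing = sketchFileStat( $tmp );

var_dump( $stat['size'] ); // int(5)
var_dump( $missing );      // bool(false)
```

Callers should compare with === so that false (no such file) and null
(failure) are not confused.
	*/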

	/**
	 * Get a SHA-1 hash of the file at a storage path in the backend.
	 *
	 * $params include:
	 *     src    : source storage path
	 *     latest : use the latest available data
	 *
	 * @param $params Array
	 * @return string|bool Hash string or false on failure
	 */
	abstract public function getFileSha1Base36( array $params );

	/**
	 * Get the properties of the file at a storage path in the backend.
	 * Returns FSFile::placeholderProps() on failure.
	 *
	 * $params include:
	 *     src    : source storage path
	 *     latest : use the latest available data
	 *
	 * @param $params Array
	 */
	abstract public function getFileProps( array $params );

	/**
	 * Stream the file at a storage path in the backend.
	 * If the file does not exist, a 404 error will be given.
	 * Appropriate HTTP headers (Status, Content-Type, Content-Length)
	 * must be sent if streaming began, while none should be sent otherwise.
	 * Implementations should flush the output buffer before sending data.
	 *
	 * $params include:
	 *     src     : source storage path
	 *     headers : additional HTTP headers to send on success
	 *     latest  : use the latest available data
	 *
	 * @param $params Array
	 */
	abstract public function streamFile( array $params );

	/**
	 * Returns a file system file, identical to the file at a storage path.
	 * The file returned is either:
	 * a) A local copy of the file at a storage path in the backend.
	 *    The temporary copy will have the same extension as the source.
	 * b) An original of the file at a storage path in the backend.
	 * Temporary files may be purged when the file object falls out of scope.
	 *
	 * Write operations should *never* be done on this file as some backends
	 * may do internal tracking or may be instances of FileBackendMultiWrite.
	 * In the latter case, there are copies of the file that must stay in sync.
	 * Additionally, further calls to this function may return the same file.
	 *
	 * $params include:
	 *     src    : source storage path
	 *     latest : use the latest available data
	 *
	 * @param $params Array
	 * @return FSFile|null Returns null on failure
	 */
	abstract public function getLocalReference( array $params );

	/**
	 * Get a local copy on disk of the file at a storage path in the backend.
	 * The temporary copy will have the same file extension as the source.
	 * Temporary files may be purged when the file object falls out of scope.
	 *
	 * $params include:
	 *     src    : source storage path
	 *     latest : use the latest available data
	 *
	 * @param $params Array
	 * @return TempFSFile|null Returns null on failure
	 */
	abstract public function getLocalCopy( array $params );

	/**
	 * Check if a directory exists at a given storage path.
	 * Backends using key/value stores will check if the path is a
	 * virtual directory, meaning there are files under the given directory.
	 *
	 * Storage backends with eventual consistency might return stale data.
	 *
	 * $params include:
	 *     dir : storage directory
	 *
	 * @return bool|null Returns null on failure
	 */
	abstract public function directoryExists( array $params );

	/**
	 * Get an iterator to list *all* directories under a storage directory.
	 * If the directory is of the form "mwstore://backend/container",
	 * then all directories in the container should be listed.
	 * If the directory is of the form "mwstore://backend/container/dir",
	 * then all directories directly under that directory should be listed.
	 * Results should be storage directories relative to the given directory.
	 *
	 * Storage backends with eventual consistency might return stale data.
	 *
	 * $params include:
	 *     dir     : storage directory
	 *     topOnly : only return direct child dirs of the directory
	 *
	 * @return Traversable|Array|null Returns null on failure
	 */
	abstract public function getDirectoryList( array $params );

	/**
	 * Same as FileBackend::getDirectoryList() except only lists
	 * directories that are immediately under the given directory.
	 *
	 * Storage backends with eventual consistency might return stale data.
	 *
	 * $params include:
	 *     dir : storage directory
	 *
	 * @return Traversable|Array|null Returns null on failure
	 */
	final public function getTopDirectoryList( array $params ) {
		return $this->getDirectoryList( array( 'topOnly' => true ) + $params );
	}
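
	/*
Illustration only, not part of the class: the "top" list helpers build their
parameters with the array union operator. For duplicate keys the union keeps
the value from the left operand, so 'topOnly' is forced to true even if the
caller passed something else. The storage path below is made up.

```php
<?php
$params = array(
	'dir'     => 'mwstore://local-backend/public/a',
	'topOnly' => false // overridden by the union below
);

// Left operand wins for keys present on both sides
$merged = array( 'topOnly' => true ) + $params;

var_dump( $merged['topOnly'] );      // bool(true)
var_dump( isset( $merged['dir'] ) ); // bool(true)
```

Writing the operands the other way round ($params + array( ... )) would
instead let the caller's value win.
	*/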

	/**
	 * Get an iterator to list *all* stored files under a storage directory.
	 * If the directory is of the form "mwstore://backend/container",
	 * then all files in the container should be listed.
	 * If the directory is of the form "mwstore://backend/container/dir",
	 * then all files under that directory should be listed.
	 * Results should be storage paths relative to the given directory.
	 *
	 * Storage backends with eventual consistency might return stale data.
	 *
	 * $params include:
	 *     dir     : storage directory
	 *     topOnly : only return direct child files of the directory (@since 1.20)
	 *
	 * @return Traversable|Array|null Returns null on failure
	 */
	abstract public function getFileList( array $params );

	/**
	 * Same as FileBackend::getFileList() except only lists
	 * files that are immediately under the given directory.
	 *
	 * Storage backends with eventual consistency might return stale data.
	 *
	 * $params include:
	 *     dir : storage directory
	 *
	 * @return Traversable|Array|null Returns null on failure
	 */
	final public function getTopFileList( array $params ) {
		return $this->getFileList( array( 'topOnly' => true ) + $params );
	}

	/**
	 * Invalidate any in-process file existence and property cache.
	 * If $paths is given, then only the cache for those files will be cleared.
	 *
	 * @param $paths Array Storage paths (optional)
	 */
	public function clearCache( array $paths = null ) {}

	/**
	 * Lock the files at the given storage paths in the backend.
	 * This will either lock all the files or none (on failure).
	 *
	 * Callers should consider using getScopedFileLocks() instead.
	 *
	 * @param $paths Array Storage paths
	 * @param $type integer LockManager::LOCK_* constant
	 */
	final public function lockFiles( array $paths, $type ) {
		return $this->lockManager->lock( $paths, $type );
	}

	/**
	 * Unlock the files at the given storage paths in the backend.
	 *
	 * @param $paths Array Storage paths
	 * @param $type integer LockManager::LOCK_* constant
	 */
	final public function unlockFiles( array $paths, $type ) {
		return $this->lockManager->unlock( $paths, $type );
	}

	/**
	 * Lock the files at the given storage paths in the backend.
	 * This will either lock all the files or none (on failure).
	 * On failure, the status object will be updated with errors.
	 *
	 * Once the return value goes out of scope, the locks will be released and
	 * the status updated. Unlock fatals will not change the status "OK" value.
	 *
	 * @param $paths Array Storage paths
	 * @param $type integer LockManager::LOCK_* constant
	 * @param $status Status Status to update on lock/unlock
	 * @return ScopedLock|null Returns null on failure
	 */
	final public function getScopedFileLocks( array $paths, $type, Status $status ) {
		return ScopedLock::factory( $this->lockManager, $paths, $type, $status );
	}
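
	/*
Illustration only, not part of the class: a sketch of the scope-bound
release that getScopedFileLocks() relies on. DemoScopedLock is a hypothetical
stand-in and is not the real ScopedLock class. It only records "held" paths
in a static array and forgets them in its destructor.

```php
<?php
// Hypothetical stand-in; shows release-on-destruct, not real locking.
class DemoScopedLock {
	public static $held = array();
	protected $paths;

	public function __construct( array $paths ) {
		$this->paths = $paths;
		foreach ( $paths as $path ) {
			self::$held[$path] = true; // "acquire"
		}
	}

	public function __destruct() {
		foreach ( $this->paths as $path ) {
			unset( self::$held[$path] ); // "release"
		}
	}
}

function doGuardedWork() {
	$lock = new DemoScopedLock( array( 'mwstore://local-backend/public/a.png' ) );
	// The lock is still held while the return value is computed;
	// $lock is destroyed when the function returns.
	return count( DemoScopedLock::$held );
}

$heldInside = doGuardedWork();
$heldAfter = count( DemoScopedLock::$held );
var_dump( $heldInside ); // int(1)
var_dump( $heldAfter );  // int(0)
```

The caller must keep the returned object in a variable for as long as the
lock is needed. Discarding the return value would release it immediately.
	*/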

	/**
	 * Get the root storage path of this backend.
	 * All container paths are "subdirectories" of this path.
	 *
	 * @return string Storage path
	 */
	final public function getRootStoragePath() {
		return "mwstore://{$this->name}";
	}

	/**
	 * Check if a given path is a "mwstore://" path.
	 * This does not do any further validation or any existence checks.
	 *
	 * @param $path string
	 * @return bool
	 */
	final public static function isStoragePath( $path ) {
		return ( strpos( $path, 'mwstore://' ) === 0 );
	}

	/**
	 * Split a storage path into a backend name, a container name,
	 * and a relative file path. The relative path may be the empty string.
	 * This does not do any path normalization or traversal checks.
	 *
	 * @param $storagePath string
	 * @return Array (backend, container, rel object) or (null, null, null)
	 */
	final public static function splitStoragePath( $storagePath ) {
		if ( self::isStoragePath( $storagePath ) ) {
			// Remove the "mwstore://" prefix and split the path
			$parts = explode( '/', substr( $storagePath, 10 ), 3 );
			if ( count( $parts ) >= 2 && $parts[0] != '' && $parts[1] != '' ) {
				if ( count( $parts ) == 3 ) {
					return $parts; // e.g. "backend/container/path"
				} else {
					return array( $parts[0], $parts[1], '' ); // e.g. "backend/container"
				}
			}
		}
		return array( null, null, null );
	}
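
	/*
Illustration only, not part of the class: a standalone copy of the splitting
logic, so the three possible outcomes can be tried without the class.
sketchSplitStoragePath() is a hypothetical name and the paths are made up.

```php
<?php
// Standalone copy of the splitting logic, for illustration only.
function sketchSplitStoragePath( $storagePath ) {
	if ( strpos( $storagePath, 'mwstore://' ) === 0 ) {
		// Drop the 10-character "mwstore://" prefix; split into at most 3 parts
		$parts = explode( '/', substr( $storagePath, 10 ), 3 );
		if ( count( $parts ) >= 2 && $parts[0] != '' && $parts[1] != '' ) {
			return ( count( $parts ) == 3 )
				? $parts
				: array( $parts[0], $parts[1], '' );
		}
	}
	return array( null, null, null );
}

$full = sketchSplitStoragePath( 'mwstore://local-backend/public/a/ab/File.png' );
// array( 'local-backend', 'public', 'a/ab/File.png' )
$containerOnly = sketchSplitStoragePath( 'mwstore://local-backend/public' );
// array( 'local-backend', 'public', '' )
$invalid = sketchSplitStoragePath( 'mwstore://local-backend' );
// array( null, null, null )
```

Because the limit passed to explode() is 3, any further slashes stay inside
the relative path element.
	*/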

	/**
	 * Normalize a storage path by cleaning up directory separators.
	 * Returns null if the path is not of the format of a valid storage path.
	 *
	 * @param $storagePath string
	 * @return string|null
	 */
	final public static function normalizeStoragePath( $storagePath ) {
		list( $backend, $container, $relPath ) = self::splitStoragePath( $storagePath );
		if ( $relPath !== null ) { // must be for this backend
			$relPath = self::normalizeContainerPath( $relPath );
			if ( $relPath !== null ) {
				return ( $relPath != '' )
					? "mwstore://{$backend}/{$container}/{$relPath}"
					: "mwstore://{$backend}/{$container}";
			}
		}
		return null;
	}

	/**
	 * Get the parent storage directory of a storage path.
	 * This returns a path like "mwstore://backend/container",
	 * "mwstore://backend/container/...", or null if there is no parent.
	 *
	 * @param $storagePath string
	 * @return string|null
	 */
	final public static function parentStoragePath( $storagePath ) {
		$storagePath = dirname( $storagePath );
		list( $b, $cont, $rel ) = self::splitStoragePath( $storagePath );
		return ( $rel === null ) ? null : $storagePath;
	}

	/**
	 * Get the final extension from a storage or FS path
	 *
	 * @param $path string
	 * @return string
	 */
	final public static function extensionFromPath( $path ) {
		$i = strrpos( $path, '.' );
		return strtolower( $i ? substr( $path, $i + 1 ) : '' );
	}
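
	/*
Illustration only, not part of the class: a standalone copy of the extension
logic showing two details that are easy to miss. The result is lowercased,
and a dot at offset 0 counts as "no extension" because the offset 0 is falsy.
sketchExtensionFromPath() is a hypothetical name.

```php
<?php
// Standalone copy of the extension logic, for illustration only.
function sketchExtensionFromPath( $path ) {
	$i = strrpos( $path, '.' );
	// No dot (false) and a dot at offset 0 both fall through to ''
	return strtolower( $i ? substr( $path, $i + 1 ) : '' );
}

var_dump( sketchExtensionFromPath( 'mwstore://local-backend/public/Archive.TAR.GZ' ) );
// string(2) "gz"
var_dump( sketchExtensionFromPath( 'README' ) );    // string(0) ""
var_dump( sketchExtensionFromPath( '.htaccess' ) ); // string(0) ""
```

Only the text after the last dot is returned, so "Archive.TAR.GZ" yields "gz"
rather than "tar.gz".
	*/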

	/**
	 * Check if a relative path has no directory traversals
	 *
	 * @param $path string
	 * @return bool
	 */
	final public static function isPathTraversalFree( $path ) {
		return ( self::normalizeContainerPath( $path ) !== null );
	}
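
	/*
Illustration only, not part of the class: a standalone sketch of the
normalize-then-reject steps behind the traversal check. The function name
sketchNormalizeContainerPath() is hypothetical. The whole-path comparisons
against '.' and '..' and the final return statements are assumptions about
how the complete method ends.

```php
<?php
// Hypothetical standalone version of the normalization steps.
function sketchNormalizeContainerPath( $path ) {
	$path = strtr( $path, '\\', '/' );               // backslashes to slashes
	$path = preg_replace( '![/]{2,}!', '/', $path ); // collapse repeated slashes
	$path = ltrim( $path, '/' );                     // drop leading slashes
	if ( strpos( $path, '.' ) !== false ) {
		if (
			$path === '.' ||                     // assumed whole-path check
			$path === '..' ||                    // assumed whole-path check
			strpos( $path, './' ) === 0 ||
			strpos( $path, '../' ) === 0 ||
			strpos( $path, '/./' ) !== false ||
			strpos( $path, '/../' ) !== false
		) {
			return null; // directory traversal detected
		}
	}
	return $path;
}

var_dump( sketchNormalizeContainerPath( 'a//b\\c.png' ) );   // string(9) "a/b/c.png"
var_dump( sketchNormalizeContainerPath( '../etc/passwd' ) ); // NULL
var_dump( sketchNormalizeContainerPath( '/a/./b' ) );        // NULL
```

A traversal-free check then reduces to comparing the result against null.
	*/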

	/**
	 * Validate and normalize a relative storage path.
	 * Null is returned if the path involves directory traversal.
	 * Traversal is insecure for FS backends and broken for others.
	 *
	 * This uses the same traversal protection as Title::secureAndSplit().
	 *
	 * @param $path string Storage path relative to a container
	 * @return string|null
	 */
	final protected static function normalizeContainerPath( $path ) {
		// Normalize directory separators
		$path = strtr( $path, '\\', '/' );
		// Collapse any consecutive directory separators
		$path = preg_replace( '![/]{2,}!', '/', $path );
		// Remove any leading directory separator
		$path = ltrim( $path, '/' );
		// Use the same traversal protection as Title::secureAndSplit()
		if ( strpos( $path, '.' ) !== false ) {
			if (
				$path === '.' ||
				$path === '..' ||
				strpos( $path, './' ) === 0 ||
				strpos( $path, '../' ) === 0 ||
				strpos( $path, '/./' ) !== false ||
				strpos( $path, '/../' ) !== false