US11392538B2 - Archiving data objects using secondary copies - Google Patents


Publication number
US11392538B2
Authority
US
United States
Prior art keywords
data
data object
storage
secondary copy
copy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
US16/934,432
Other versions
US20200349107A1 (en)
Inventor
Parag Gokhale
Rajiv Kottomtharayil
Prakash Varadharajan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Commvault Systems Inc
Original Assignee
Commvault Systems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Commvault Systems Inc
Priority to US16/934,432
Assigned to COMMVAULT SYSTEMS, INC. Assignment of assignors interest (see document for details). Assignors: GOKHALE, PARAG; KOTTOMTHARAYIL, RAJIV; VARADHARAJAN, PRAKASH
Publication of US20200349107A1
Assigned to JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT. Security interest (see document for details). Assignors: COMMVAULT SYSTEMS, INC.
Priority to US17/841,575 (US11768800B2)
Application granted
Publication of US11392538B2
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/113 Details of archiving
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1451 Management of the data involved in backup or backup restore by selection of backup contents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/219 Managing data history or versioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G06F16/24542 Plan optimisation
    • G06F16/24544 Join order optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/84 Using snapshots, i.e. a logical point-in-time copy of the data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0641 De-duplication techniques

Definitions

  • a primary copy of data is generally a production copy or other “live” version of the data which is used by a software application and is generally in the native format of that application.
  • Primary copy data may be maintained in a local memory or other high-speed storage device that allows for relatively fast data access if necessary.
  • Such primary copy data is typically intended for short term retention (e.g., several hours or days) before some or all of the data is stored as one or more secondary copies, for example, to prevent loss of data in the event a problem occurred with the data stored in primary storage.
  • secondary copies can be made.
  • data protection copies include a backup copy, a snapshot copy, a hierarchical storage management (“HSM”) copy, an archive copy, and other types of copies.
  • a backup copy is generally a point-in-time copy of the primary copy data stored in a backup format as opposed to in native application format.
  • a backup copy may be stored in a backup format that is optimized for compression and efficient long-term storage.
  • Backup copies generally have relatively long retention periods and may be stored on media with slower retrieval times than other types of secondary copies and media. In some cases, backup copies may be stored at an offsite location.
  • each incremental backup operation copies only the primary copy data that has changed since the last full or incremental backup of the data set was performed. In this way, even if the entire set of primary copy data that is backed up is large, the amount of data that must be transferred during each incremental backup operation may be significantly smaller, since only the changed data needs to be transferred to secondary storage.
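
The selection step of an incremental backup can be illustrated with a minimal Python sketch. The `PrimaryObject` type, its `modified_at` field, and the function name are assumptions made for this illustration; the source does not say how changes are detected.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PrimaryObject:
    path: str
    modified_at: float  # hypothetical field: seconds since some epoch


def select_for_incremental(objects: List[PrimaryObject],
                           last_backup_at: float) -> List[PrimaryObject]:
    """Return only the objects changed since the last full or incremental backup."""
    return [obj for obj in objects if obj.modified_at > last_backup_at]


objects = [
    PrimaryObject("/docs/a.txt", modified_at=100.0),
    PrimaryObject("/docs/b.txt", modified_at=250.0),
    PrimaryObject("/docs/c.txt", modified_at=300.0),
]
# Only objects modified after the previous backup (time 200.0) are transferred.
changed = select_for_incremental(objects, last_backup_at=200.0)
```

Only two of the three objects would be transferred here, which is the saving the passage describes.
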
  • one or more full backup and subsequent incremental copies may be utilized together to periodically or intermittently create a synthetic full backup copy. More details regarding synthetic storage operations are found in commonly-assigned U.S. patent application Ser. No. 12/510,059, entitled “Snapshot Storage and Management System with Indexing and User Interface,” filed Jul. 27, 2009, now U.S. Pat. No. 7,873,806, which is hereby incorporated herein in its entirety.
  • An archive copy is generally a copy of the primary copy data, but typically includes only a subset of the primary copy data that meets certain criteria and is usually stored in a format other than the native application format.
  • an archive copy might include only that data from the primary copy that is larger than a given size threshold or older than a given age threshold and that is stored in a backup format.
  • archive data is removed from the primary copy, and a stub is stored in the primary copy to indicate its new location.
  • systems use the stub to locate the data and often make recovery of the data appear transparent, even though the archive data may be stored at a location different from the remaining primary copy data.
  • Archive copies are typically created and tracked independently of other secondary copies, such as other backup copies.
  • the data storage system transfers a secondary copy of primary copy data to secondary storage and tracks the backup copy using a backup index separate from the archive index.
  • a conventional data storage system transfers the primary copy data to be archived to secondary storage to create an archive copy, replaces the primary copy data with a stub, and tracks the archive copy using an archive index. Accordingly, a primary copy data object that is both archived and backed up is transferred to secondary storage two separate times.
  • the data storage system may not be able to devote such resources to other tasks. Moreover, the data storage system is required to devote resources to maintaining each separate index.
  • the archive index may be unaware of the other secondary copy and the other secondary index may be unaware of the archive copy, which may lead to further inefficiencies.
  • In the event that an archive copy is moved or transferred (e.g., to another tier of secondary storage), the archive index may not be able to be updated to reflect the move or transfer. In such cases, the data storage system may be unable to use the stub to locate the archived data object.
  • archiving operations may require the transfer of large quantities of data during a single archive operation.
  • the retention criteria for an organization may specify that data objects more than two years old should be archived.
  • the organization may amass large quantities of data.
  • When the first archive operation finally occurs, e.g., approximately two years into the operation of the organization, it may be necessary to transfer a large amount of the organization's data.
  • backup, archive, and other secondary storage operations may unnecessarily preserve secondary copies of data created from primary data that has been deleted or is otherwise no longer being actively used as production data by a computing system, such as a workstation or server.
  • secondary storage requirements may increasingly and unnecessarily bloat over time.
  • FIG. 1 is a block diagram illustrating an environment in which a system for archiving data objects using secondary copies operates.
  • FIG. 2 is a flow diagram illustrating a process implemented by the system in connection with archiving data objects using secondary copies.
  • FIG. 3 is a flow diagram illustrating a process implemented by the system in connection with reclaiming space used to store secondary copies.
  • FIGS. 4A-4C are data structure diagrams illustrating data structures used by the system.
  • FIG. 5 is a block diagram illustrating a data storage system in which the system operates.
  • a software, firmware, and/or hardware system for archiving data objects using secondary copies (the “system”) is disclosed.
  • the system creates one or more secondary copies of primary copy data (e.g., production data stored by a production computing system).
  • the primary copy data contains multiple data objects (e.g., multiple files, emails, or other logical groupings or collections of data).
  • the system maintains a first data structure that tracks the data objects for which the system has created secondary copies and the locations of the secondary copies.
  • the system applies rules to determine which data objects are to be archived.
  • the system verifies that previously-created secondary copies of data objects to be archived exist and replaces the data objects with stubs, pointers or logical addresses.
  • the system maintains a second data structure that both tracks the stubs and refers to the first data structure, thereby creating an association between the stubs and the locations of the secondary copies.
  • the system archives data objects without creating an additional or other secondary copy of the data objects.
  • the association between the two data structures allows stubs to point to or refer to the previously-created secondary copy of the data objects. Accordingly, the existence of the previously-created secondary copy of the data objects allows the system to forego creating an additional or other secondary copy of the data objects, thereby saving resources.
  • the system may also perform a process to reclaim space used to store secondary copies. To do so, the system scans or analyzes the primary copy data to identify the data objects that exist in the primary copy data and stores the results of the scan or analysis in a third data structure. The system then compares the first and third data structures (e.g., the system performs a difference of the first and third data structures) to determine which data objects in the primary copy data have been deleted. For each deleted data object, the system updates the corresponding entry in the first data structure. Then the system accesses the first data structure and determines 1) which data objects in the primary copy data have not been deleted and 2) which have been deleted, but whose deletion occurred less than a predetermined period of time ago.
  • For each data object determined in this fashion, the system then creates, from the first secondary copy of the data object, a second secondary copy of the data object.
  • the system can then create a new first data structure or update the existing first data structure to reflect the second secondary copies of the data objects.
  • FIG. 1 is a block diagram illustrating an environment 100 in which the system may operate.
  • the environment 100 includes one or more clients 130 , one or more primary data stores 160 , a secondary storage computing device 165 (alternatively referred to as a “media agent”), and one or more storage devices 115 .
  • Each of the clients 130 is a computing device, examples of which are described herein.
  • Clients may be, as non-exclusive examples, servers, workstations, personal computers, computerized tablets, PDAs, smart phones, or other computers having social networking data, such as Facebook data.
  • the clients 130 are each connected to one or more associated primary data stores 160 and to the secondary storage computing device 165 .
  • the secondary storage computing device 165 is connected to the storage device 115 .
  • the primary data stores 160 and storage device 115 may each be any type of storage suitable for storing data, such as Directly-Attached Storage (DAS) such as hard disks, a Storage Area Network (SAN), e.g., a Fibre Channel SAN, an iSCSI SAN or other type of SAN, Network-Attached Storage (NAS), a tape library, or any other type of storage.
  • the clients 130 and the secondary storage computing device 165 typically include application software to perform desired operations and an operating system on which the application software runs.
  • the clients 130 and the secondary storage computing device 165 typically also include a file system that facilitates and controls file access by the operating system and application software. The file system facilitates access to local and remote storage devices for file or data access and storage.
  • the clients 130 utilize data, which includes files, directories, metadata (e.g., ACLs, descriptive metadata, and any other streams associated with the data), and other data objects, which may be stored in the primary data store 160 .
  • the data of a client 130 is generally a primary copy (e.g., a production copy).
  • a client 130 may in fact be a production server, such as a file server or Exchange server, which provides live production data to multiple user workstations as part of its function.
  • Each client 130 includes a data agent 195 (described in more detail with reference to FIG. 5 ).
  • the data agents 195 send a copy of data objects in a primary data store 160 to the secondary storage computing device 165 .
  • the secondary storage computing device 165 includes a memory 114 .
  • the memory 114 includes software 116 incorporating components 118 and data 119 typically used by the system.
  • the components 118 include a secondary copy component 128 that performs secondary copy operations and a pruning component 129 that performs space reclamation or pruning operations.
  • the data 119 includes secondary copy data structure 122 , stubs data structure 124 , and primary copy data structure 126 .
  • the system uses the data 119 to, among other things, track data objects copied during archive and other secondary copy operations and to track data objects in primary copy data.
  • While items 118 and 119 are illustrated as stored in memory 114 , those skilled in the art will appreciate that these items, or portions of them, may be transferred between memory 114 and a persistent storage device 106 (for example, a magnetic hard drive, a tape of a tape library, etc.) for purposes of memory management, data integrity, and/or other purposes.
  • the secondary storage computing device 165 further includes one or more central processing units (CPU) 102 for executing software 116 , and a computer-readable media drive 104 for reading information or installing software 116 from tangible computer-readable storage media, such as a floppy disk, a CD-ROM, a DVD, a USB flash drive, and/or other tangible computer-readable storage media.
  • the secondary storage computing device 165 also includes one or more of the following: a network connection device 108 for connecting to a network, an information input device 110 (for example, a mouse, a keyboard, etc.), and an information output device 112 (for example, a display).
  • FIG. 2 is a flow diagram illustrating a process 200 implemented by the system in connection with archiving data objects using secondary copies in some examples.
  • the process 200 begins at step 205 , where the system creates a full secondary copy of the primary copy data of a client 130 , by creating a secondary copy of the entire primary copy data and transferring the secondary copy to the storage device 115 .
  • the system may also create one or more incremental copies of the primary copy data by transferring only the primary copy data that has changed since the time of the full copy or a previous incremental copy. For example, the system may perform only a single full backup of all the primary copy data that is to be protected (as defined, for example, by a storage policy or other criteria) and store the full backup on the storage device 115 .
  • the system may then create weekly, daily, periodic, intermittent or continuous incremental backup copies of only the primary copy data that has changed since the system performed the last backup operation.
  • the system may use one or more of the full backup, incremental backups, and/or previous synthetic full backups to generate a new synthetic full backup copy via a synthetic full operation.
  • the system may process data objects that have been deleted from the primary copy of the data and remove these data objects from the synthetic full copy.
  • the generation of a new synthetic full backup copy or other synthetic full operation requires reading one or more previous backup copies or other types of secondary copies, rehydrating or decompressing the previous secondary copy or copies, and re-deduplicating the previous secondary copy or copies.
  • the generation of a new synthetic full backup copy or other synthetic operation does not require reading, rehydrating, or re-deduplicating a previous backup or other secondary copy. Instead, reference counts may be updated and metadata may be added to the synthetic full copy.
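
At a purely logical level, a synthetic full operation can be pictured as merging a full copy with its later incremental copies and dropping objects deleted from the primary copy. The sketch below is a simplification under stated assumptions: each copy is modeled as a dictionary from a hypothetical object ID to a version label, and it does not model deduplication, rehydration, or reference counts.

```python
from typing import Dict, Iterable, List


def synthesize_full(full: Dict[str, str],
                    incrementals: List[Dict[str, str]],
                    deleted_ids: Iterable[str]) -> Dict[str, str]:
    """Merge a full copy with ordered incrementals into a synthetic full copy."""
    synthetic = dict(full)          # start from the last full copy
    for inc in incrementals:        # oldest first, so newer versions win
        synthetic.update(inc)
    for object_id in deleted_ids:   # drop objects deleted from the primary copy
        synthetic.pop(object_id, None)
    return synthetic


full = {"a": "a-v1", "b": "b-v1", "c": "c-v1"}
incrementals = [{"b": "b-v2"}, {"d": "d-v1", "b": "b-v3"}]
synthetic = synthesize_full(full, incrementals, deleted_ids={"c"})
```

The inputs are left untouched, so the earlier copies remain available until the system decides to prune them.
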
  • FIG. 4A is a data structure diagram illustrating the secondary copy data structure 122 .
  • the secondary copy data structure 122 contains rows, such as rows 425 a and 425 b , each divided into the following columns: an ID column 405 containing an identifier of a data object (e.g., a globally unique identifier—GUID), a primary copy location column 410 containing the location of the primary copy of the data object, a secondary copy location column 415 containing the location of the secondary copy of the data object, and a deletion time column 420 containing a time stamp of when the primary copy of the data object was deleted.
  • the secondary copy data structure 122 may also include other columns that may contain additional data about data objects.
  • the system may additionally or alternatively use relative locations to indicate the locations of data objects in the secondary copy data structure 122 .
  • the system may store secondary copies of data objects using a logical archive file and specify a relative location within the logical archive file for a secondary copy location.
  • the system may store secondary copies of data objects on tape and specify a tape and an offset within the tape for a secondary copy location.
  • Although FIG. 4A illustrates entries corresponding to files in the secondary copy data structure 122 , the disclosed techniques may also be used with other types of data objects, such as emails and email attachments, database or spreadsheet objects, data blocks, and other data objects stored in other data repositories. Accordingly, the disclosure is not to be construed as limited solely to files.
  • the system may utilize a single secondary copy data structure 122 for each client 130 (or subclient thereof) or for each set of data subject to data protection operations, which may be the data of a single client 130 or the data of multiple clients 130 . Additionally or alternatively, the system may use a single secondary copy data structure 122 for multiple clients 130 or for multiple sets of data subject to data protection operations, which may be the data of a single client 130 or the data of multiple clients 130 . In such a case, the secondary copy data structure 122 may contain additional columns containing data that allows for differentiation of data associated with different clients 130 or different sets of data.
  • In adding entries for each new copy of a data object, the system adds a new row 425 to the secondary copy data structure 122 .
  • the system may generate the identifier for each secondary copy of a data object created and, in the new row 425 , add the identifier to column 405 , add the primary copy location of the data object to column 410 , and add the secondary copy location to column 415 .
  • the system may also store additional data as part of step 210 , such as in other columns of the secondary copy data structure 122 or in other data structures.
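
The columns of the secondary copy data structure 122 and the addition of a new row can be sketched as follows. This is an illustrative model only: the class and field names, the in-memory dictionary, and the example paths are assumptions, and the source does not specify how the structure is stored.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class SecondaryCopyEntry:
    object_id: str                            # ID column 405 (e.g., a GUID)
    primary_location: str                     # primary copy location column 410
    secondary_location: str                   # secondary copy location column 415
    deletion_time: Optional[datetime] = None  # deletion time column 420


class SecondaryCopyIndex:
    """In-memory stand-in for the secondary copy data structure 122."""

    def __init__(self) -> None:
        self.rows: Dict[str, SecondaryCopyEntry] = {}

    def add_entry(self, primary_location: str, secondary_location: str) -> str:
        # Generate an identifier and add a new row for the secondary copy.
        object_id = str(uuid.uuid4())
        self.rows[object_id] = SecondaryCopyEntry(
            object_id, primary_location, secondary_location)
        return object_id


index = SecondaryCopyIndex()
# Hypothetical locations; the secondary location is relative to an archive file.
new_id = index.add_entry("/Users/1/data/file1.doc", "archive_file_1/offset_1")
```

A new row starts with no deletion time, which later steps read as "not deleted from the primary copy".
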
  • the system identifies data objects in the primary copy data that are to be archived. For example, the system may apply one or more rules or criteria based on any combination of data object type, data object age, data object size, percentage of disk quota, remaining storage, metadata (e.g., a flag or tag indicating importance) and/or other factors.
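
As one possible reading of such rules, the sketch below combines an age threshold with a size threshold. The `Candidate` type, the thresholds, and the choice to require both conditions are assumptions for illustration; the source allows any combination of criteria.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Candidate:
    path: str
    size_bytes: int
    last_modified: datetime


def should_archive(obj: Candidate, now: datetime,
                   max_age: timedelta, min_size_bytes: int) -> bool:
    """Hypothetical rule: archive objects that are both old enough and large enough."""
    is_old = (now - obj.last_modified) > max_age
    is_large = obj.size_bytes >= min_size_bytes
    return is_old and is_large


now = datetime(2010, 9, 30)
candidates = [
    Candidate("/data/old_big.bin", 5_000_000, datetime(2008, 1, 1)),
    Candidate("/data/new_big.bin", 5_000_000, datetime(2010, 9, 1)),
    Candidate("/data/old_small.txt", 10, datetime(2008, 1, 1)),
]
to_archive = [c.path for c in candidates
              if should_archive(c, now, timedelta(days=730), 1_000_000)]
```
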
  • the system verifies that a secondary copy of each data object has been made. To do so, the system may access the secondary copy data structure 122 to determine that secondary copies of the identified data objects exist. Also at step 220 , the system obtains a token for each identified data object. The token represents confirmation or verification that a secondary copy of a data object was previously created, and is typically unique for each data object.
  • the system replaces each of the identified data objects in the primary copy data with a stub containing the token.
  • the stub is typically a small data object that indicates, points to, or refers to the location of the secondary copy of the data object and facilitates recovery of the data object. More details as to archiving operations may be found in the commonly-assigned currently pending U.S. Patent Application Number 2008/0229037, the entirety of which is incorporated by reference herein.
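
The verify-then-stub sequence described above can be sketched as follows. The dictionaries standing in for the secondary copy data structure and the primary copy data, the stub layout, and the token format are all assumptions; the source says only that the token confirms an existing secondary copy and is typically unique per data object.

```python
import uuid
from typing import Dict, Optional


def archive_object(object_id: str,
                   secondary_index: Dict[str, str],
                   primary_store: Dict[str, object]) -> Optional[str]:
    """Replace a primary object with a stub only if a secondary copy already exists."""
    if object_id not in secondary_index:
        return None  # no secondary copy recorded, so the object is left in place
    token = uuid.uuid4().hex  # hypothetical token format
    # No data is transferred here: the stub simply replaces the primary object.
    primary_store[object_id] = {"stub": True, "token": token}
    return token


secondary_index = {"obj-1": "archive_file_1/offset_1"}
primary_store = {"obj-1": b"large payload", "obj-2": b"not yet copied"}
token_1 = archive_object("obj-1", secondary_index, primary_store)
token_2 = archive_object("obj-2", secondary_index, primary_store)
```

In this sketch an object without a secondary copy is simply skipped; the system could instead create the copy first or flag the object for later archiving.
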
  • FIG. 4B is a data structure diagram illustrating the stubs data structure 124 .
  • the stubs data structure 124 contains rows, such as rows 465 a and 465 b , each divided into the following columns: an ID column 455 containing the identifier of a data object (e.g., the GUID) and a token column 460 containing the token previously created or generated for the data object.
  • the stubs data structure 124 may also include other columns that may contain additional data about data objects.
  • the system may utilize a single stubs data structure 124 for a single data objects data structure 122 , a single stubs data structure 124 for multiple data objects data structures 122 , and/or multiple stubs data structures 124 for multiple data objects data structures 122 .
  • the system adds a new row 465 to the stubs data structure 124 .
  • the system adds the identifier that corresponds to the data object associated with the stub to column 455 and the token obtained in step 220 to column 460 .
  • the system may also store additional data as part of step 235 , such as in other columns of the stubs data structure 124 or in other data structures.
  • the entries in rows 465 a and 465 b indicate that the system archived the data objects identified in rows 425 a and 425 d , respectively, of the secondary copy data structure 122 .
  • the system adds entries to the secondary copy data structure 122 for the stubs.
  • rows 425 f and 425 g correspond to the entries for the stubs.
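
To show how the stubs data structure 124 ties a stub back to a secondary copy location, here is a minimal sketch in which both structures are plain dictionaries keyed by object identifier. The function name, token values, and locations are hypothetical.

```python
from typing import Dict


def resolve_stub(token: str,
                 stubs_index: Dict[str, str],
                 secondary_index: Dict[str, str]) -> str:
    """Follow token -> object ID (stubs structure) -> secondary copy location."""
    for object_id, stored_token in stubs_index.items():
        if stored_token == token:
            return secondary_index[object_id]
    raise KeyError(f"unknown token: {token}")


# ID column -> secondary copy location (subset of data structure 122).
secondary_index = {"guid-1": "disk_pool_a/archive_file_1/offset_1"}
# ID column 455 -> token column 460 (data structure 124).
stubs_index = {"guid-1": "token-abc"}

before_move = resolve_stub("token-abc", stubs_index, secondary_index)
# If the secondary copy moves, only the secondary copy entry is updated.
secondary_index["guid-1"] = "tape_7/offset_4096"
after_move = resolve_stub("token-abc", stubs_index, secondary_index)
```

Because the stub is resolved through the shared identifier rather than a fixed location, the same token still resolves after the copy is moved in this model.
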
  • the system determines which data objects in the primary copy data have been deleted.
  • the system may use various techniques to determine which data objects in the primary copy data have been deleted. For example, the system may scan or analyze the primary copy data on a periodic or ad-hoc basis, and populate a data structure that contains entries for each of the data objects in the primary copy data.
  • FIG. 4C is a data structure diagram illustrating the primary copy data structure 126 .
  • the primary copy data structure 126 (alternatively referred to as an “image map”) is generally similar to the secondary copy data structure 122 but contains entries only for data objects existing in the primary copy data as of the most recent scan or analysis of the primary copy data.
  • the system can compare the secondary copy data structure 122 with the primary copy data structure 126 .
  • the data objects that are in the secondary copy data structure 122 but not in the primary copy data structure 126 are the data objects that have been deleted.
  • the system can use other techniques to determine when a data object in the primary copy data has been deleted, such as by receiving information from a driver or file system filter on the client 130 that detects such deletions. Additionally or alternatively, the system can predict if and when a data object in primary copy data has been deleted based upon information available to the system, such as heuristics or historical data.
  • At step 245 , the system updates the entries in the secondary copy data structure 122 corresponding to the deleted data objects to include their deletion times.
  • the system may use the time of the last scan or analysis as the deletion times or may use the actual deletion times of the data objects. After step 245 , the process 200 concludes.
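
The comparison of the secondary copy data structure 122 with the primary copy data structure 126, followed by the deletion-time update, can be sketched as a set difference. The row layout and identifiers are assumptions, and this sketch uses the scan time as the deletion time, which is one of the two options the text mentions.

```python
from datetime import datetime
from typing import Dict, List, Set


def mark_deletions(secondary_rows: Dict[str, dict],
                   primary_ids: Set[str],
                   scan_time: datetime) -> List[str]:
    """Stamp a deletion time on entries that are absent from the latest primary scan."""
    newly_deleted = []
    for object_id, row in secondary_rows.items():
        # In the secondary structure but not in the image map: deleted.
        if object_id not in primary_ids and row["deletion_time"] is None:
            row["deletion_time"] = scan_time
            newly_deleted.append(object_id)
    return newly_deleted


secondary_rows = {
    "guid-1": {"primary_location": "/data/a", "deletion_time": None},
    "guid-2": {"primary_location": "/data/b", "deletion_time": None},
}
primary_ids = {"guid-1"}  # IDs found by the most recent scan of primary data
scan_time = datetime(2010, 9, 30, 2, 0)
deleted = mark_deletions(secondary_rows, primary_ids, scan_time)
```

Entries that already carry a deletion time are left alone, so a repeated scan does not overwrite the original stamp.
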
  • the process 200 may be varied while still coming within the general scope of the process 200 .
  • the system may not archive the data object in the primary copy data.
  • the system may create a secondary copy of the data object and add an entry to the secondary copy data structure 122 before archiving the data object.
  • the system may flag the data object for later archiving after the system has created a secondary copy of the data object at a later time.
  • the system may perform other variations of the process 200 .
  • FIG. 3 is a flow diagram illustrating a process 300 implemented by the system in connection with reclaiming space used to store secondary copies in some examples (alternatively referred to as “pruning data”).
  • the process 300 begins at step 305 where the system accesses the secondary copy data structure 122 .
  • the system begins iterating through each entry in the secondary copy data structure 122 .
  • the system determines whether the data object in the primary copy data identified in the entry has been deleted. If not, the process 300 continues to step 320 , where the system creates a second secondary copy of the data object from the first secondary copy, and may delete the first secondary copy either immediately or at a later time, e.g., at the conclusion of the process 300 .
  • the data object identified in row 425 a of the secondary copy data structure 122 because it has no deletion time, has not been deleted.
  • the system can create the second secondary copy of the data object on the same media as the first secondary copy or on different media (e.g., if the first secondary copy is stored on disk, the system can create the second secondary copy on another disk, on tape, and/or on a cloud storage service).
  • At step 335 , the system determines whether the deletion time of the data object is longer ago than a predetermined, configurable period of time (e.g., longer than one year ago). For example, the data object identified in row 425 b has a deletion time and has therefore been deleted. If the deletion time is not longer ago than the predetermined period (e.g., the data object was deleted less than a year ago), the process 300 continues to step 320 , described above. If the deletion time of the data object is longer ago than the predetermined period of time, the process 300 skips step 320 (that is, it does not create a second secondary copy of the data object).
  • the system may delete the secondary copy of the long-deleted data object either immediately or at a later time, e.g., at the conclusion of the process 300 . For example, if the system is performing the process 300 on Sep. 30, 2010 and the predetermined period of time is 90 days, then the system would not create a second secondary copy of the data object identified in row 425 b because it was deleted on Jun. 25, 2010. However, the system would create a second secondary copy of the data object identified in row 425 e because it was deleted on Jul. 10, 2010, which is less than 90 days before Sep. 30, 2010.
  • the predetermined period of time acts as a timer that starts when a data object in primary copy data has been deleted (or when the system detects the deletion). After the timer has expired, the system no longer needs to store the secondary copy of the data object. Storing the secondary copy of the data object for a period of time past the deletion time of the data object in primary copy data allows the secondary copy of the data object to be retrieved or recalled if, for example, the data object needed to be recovered to satisfy an e-discovery or legal hold request.
  • the predetermined period of time can be set according to archival rules or storage policies (e.g., to comply with e-discovery or other requirements). The predetermined period may vary based on the type of data object.
  • certain types of data objects may have a longer predetermined period of time than other types of data (e.g., personal emails).
  • the system may determine the data type by content indexing the data objects or by accessing data classifications of the data objects.
  • the predetermined period of time allows for data objects to be recovered in the case of accidental or unintended deletion or in case data objects appear to have been deleted. For example, if a user accidentally or unintentionally deletes a data object in primary copy data, the user has until at least the expiration of the predetermined period of time to discover the accidental or unintended deletion and request that the deleted data object be recovered.
  • If a volume containing a set of data objects becomes unmounted, then upon scanning or analyzing the primary copy data, the system would determine that the data objects have been deleted and accordingly update the corresponding entries in the secondary copy data structure 122 . As long as the volume is remounted before the predetermined period of time expires, the system will not delete the secondary copies of the data objects. When the volume is remounted, the system can recognize that the data objects are already tracked in the secondary copy data structure 122 and remove the deletion times from the corresponding entries in the secondary copy data structure 122 .
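  • The deletion-time bookkeeping described above can be sketched as follows. This is a minimal illustration only: the `Entry` class, its fields, and the `scan_primary` function are hypothetical names, and the source does not specify how the secondary copy data structure 122 is actually implemented.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional, Set


@dataclass
class Entry:
    """One hypothetical row of the secondary copy data structure 122."""
    object_id: str
    secondary_copy_location: str
    deletion_time: Optional[datetime] = None  # None means "not deleted"


def scan_primary(entries: Dict[str, Entry], present: Set[str], now: datetime) -> None:
    """Reconcile tracked entries with the objects currently visible in primary data."""
    for entry in entries.values():
        if entry.object_id in present:
            # Object is visible again (e.g., its volume was remounted):
            # remove the deletion time so the secondary copy is not pruned.
            entry.deletion_time = None
        elif entry.deletion_time is None:
            # Object has just disappeared: record when the deletion was detected.
            entry.deletion_time = now
```

  • In this sketch an existing deletion time is never overwritten by a later scan, so the retention timer keeps running from the first detection until the object reappears.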
  • At step 325 , the system moves to the next entry in the secondary copy data structure 122 and performs the above steps with respect to the data object identified in the next entry.
  • the process 300 continues at step 330 , where the system generates a new secondary copy data structure 122 that includes entries corresponding to only the data objects for which the system created second secondary copies.
  • the new secondary copy data structure 122 also includes the locations of the second secondary copies of the data objects.
  • the system may also delete the old secondary copy data structure. After step 330 the process 300 concludes.
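  • As a rough sketch of the process 300 as a whole, the loop below walks the entries, skips data objects deleted longer ago than the retention period, and returns a new data structure covering only the re-copied objects. The `Entry` class, the `prune` function, and the location strings are assumptions made for illustration; the copy in step 320 is represented only by recording a new location.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical row of the secondary copy data structure 122."""
    row: str
    location: str
    deletion_time: Optional[date] = None  # None: object still exists in primary data


def prune(entries: List[Entry], today: date, retention: timedelta, new_media: str) -> List[Entry]:
    """Return the new data structure, keeping only entries that were re-copied."""
    kept = []
    for entry in entries:
        deleted_long_ago = (
            entry.deletion_time is not None
            and today - entry.deletion_time > retention
        )
        if deleted_long_ago:
            continue  # step 320 is skipped: no second secondary copy is created
        # Stand-in for step 320: record where the second secondary copy would live.
        kept.append(replace(entry, location=f"{new_media}/{entry.row}"))
    return kept
```

  • With the dates used in the example above (process run on Sep. 30, 2010, 90-day period), an entry deleted on Jun. 25, 2010 is 97 days old and is dropped, while an entry deleted on Jul. 10, 2010 is 82 days old and is kept.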
  • the process 300 may be varied while still coming within the general scope of the process 300 .
  • the system may instead delete certain first secondary copies of data objects, e.g., those data objects having a deletion time longer ago than a predetermined, configurable, period of time.
  • the system may delete rows from the existing secondary copy data structure 122 corresponding to the data objects having a deletion time longer ago than a predetermined, configurable, period of time, for which the system did not create second secondary copies.
  • the system may also update the secondary copy locations of the rows corresponding to the data objects for which the system did create second secondary copies.
  • the system may additionally or alternatively prune a secondary copy of a data object when other criteria are met, such as criteria relating to the creation time, modification time, size, file type, or other characteristics of the data object in the primary copy data.
  • the system may perform other variations of the process 300 .
  • One advantage of the techniques described herein is that the system can avoid creating additional secondary copies of data objects in primary copy data when archiving the data objects. Instead, the system can use the associations between the secondary copy data structure 122 and the stubs data structure 124 to point or refer stubs to the previously-created secondary copy of the data objects. Accordingly, the existence of the previously-created secondary copy of the data objects allows the system to forego creating another secondary copy of the data objects when archiving the data objects, thereby saving resources. Since the system only transfers a data object from primary storage to secondary storage once instead of twice (e.g., once for backup, once for archive), it may save network bandwidth and processing capacity.
  • the system may avoid a single, large data transfer when it later archives the same set of data objects. Instead, the set of data objects in primary storage may simply be replaced with stubs when the time comes to archive them. As another example, since the system only stores a single copy of each data object in secondary storage, instead of two copies, the total secondary storage capacity needed by the system may be reduced.
  • Yet another advantage of the techniques described herein is that the system can use a common set of data structures to track both archive operations and other secondary copy operations, thereby potentially simplifying the tracking of both types of operations.
  • Another advantage is that since only one secondary copy of a data object needs to be created, other ancillary processes such as content-indexing, encryption, compression, data classification and/or deduplication or single-instancing of the secondary copy need only be performed once on the single secondary copy, instead of multiple times on each secondary copy.
  • the secondary copy data structure 122 can be updated to account for moved or transferred secondary copies (e.g., data objects moved to another tier of secondary storage). Accordingly, the stub of a data object whose secondary copy was moved or transferred can still be used to locate and recall the moved or transferred data object.
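  • The association between stubs and secondary copies can be pictured with two small mappings. This is a hedged sketch, not the patented implementation: `secondary_copies`, `stubs`, `archive`, and `recall_location` are hypothetical names standing in for the secondary copy data structure 122 and the stubs data structure 124 .

```python
from typing import Dict

# Hypothetical stand-ins for the secondary copy data structure 122 and
# the stubs data structure 124; keys and values are illustrative only.
secondary_copies: Dict[str, str] = {}  # entry id -> location of the secondary copy
stubs: Dict[str, str] = {}             # stub path -> entry id in secondary_copies


def archive(path: str, entry_id: str) -> None:
    """Replace a primary object with a stub that refers to an existing entry."""
    if entry_id not in secondary_copies:
        raise LookupError(f"no secondary copy recorded for {entry_id}")
    stubs[path] = entry_id  # no second transfer to secondary storage is needed


def recall_location(path: str) -> str:
    """Resolve a stub to the current location of its secondary copy."""
    return secondary_copies[stubs[path]]
```

  • Because the stub stores a reference to the entry rather than a location, updating the entry when a secondary copy moves to another tier is enough for the stub to keep resolving correctly.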
  • Still another advantage of the techniques described herein is that by pruning data, e.g., in response to the deletion of corresponding primary data, the secondary storage capacity requirements are reduced.
  • FIG. 5 illustrates an example of one arrangement of resources in a computing network, comprising a data storage system 500 .
  • the resources in the data storage system 500 may employ the processes and techniques described herein.
  • the system 500 includes a storage manager 105 , one or more data agents 195 , one or more secondary storage computing devices 165 , one or more storage devices 115 , one or more computing devices 130 (called clients 130 ), one or more data or information stores 160 and 162 , a single instancing database 123 , an index 111 , a jobs agent 120 , an interface agent 125 , and a management agent 131 .
  • the system 500 may represent a modular storage system such as the CommVault QiNetix system, and also the CommVault GALAXY backup system, available from CommVault Systems, Inc. of Oceanport, N.J., aspects of which are further described in the commonly-assigned U.S. patent application Ser. No. 09/610,738, now U.S. Pat. No. 7,035,880, the entirety of which is incorporated by reference herein.
  • the system 500 may also represent a modular storage system such as the CommVault Simpana system, also available from CommVault Systems, Inc.
  • the system 500 may generally include combinations of hardware and software components associated with performing storage operations on electronic data. Storage operations include copying, backing up, creating, storing, retrieving, and/or migrating primary storage data (e.g., data stores 160 and/or 162 ) and secondary storage data (which may include, for example, snapshot copies, backup copies, hierarchical storage management (HSM) copies, archive copies, and other types of copies of electronic data stored on storage devices 115 ).
  • the system 500 may provide one or more integrated management consoles for users or system processes to interface with in order to perform certain storage operations on electronic data as further described herein. Such integrated management consoles may be displayed at a central control system or several similar consoles distributed throughout multiple network locations to provide global or geographically specific network data storage information.
  • storage operations may be performed according to various storage preferences, for example, as expressed by a user preference, a storage policy, a schedule policy, and/or a retention policy.
  • a “storage policy” is generally a data structure or other information source that includes a set of preferences and other storage criteria associated with performing a storage operation.
  • the preferences and storage criteria may include, but are not limited to, a storage location, relationships between system components, network pathways to utilize in a storage operation, data characteristics, compression or encryption requirements, preferred system components to utilize in a storage operation, a deduplication, single instancing or variable instancing policy to apply to the data, and/or other criteria relating to a storage operation.
  • a storage policy may indicate that certain data is to be stored in the storage device 115 , retained for a specified period of time before being aged to another tier of secondary storage, copied to the storage device 115 using a specified number of data streams, etc.
  • a “schedule policy” may specify a frequency with which to perform storage operations and a window of time within which to perform them. For example, a schedule policy may specify that a storage operation is to be performed every Saturday morning from 2:00 a.m. to 4:00 a.m. In some cases, the storage policy includes information generally specified by the schedule policy. (Put another way, the storage policy includes the schedule policy.)
  • a “retention policy” may specify how long data is to be retained at specific tiers of storage or what criteria must be met before data may be pruned or moved from one tier of storage to another tier of storage. Storage policies, schedule policies and/or retention policies may be stored in a database of the storage manager 105 , to archive media as metadata for use in restore operations or other storage operations, or to other locations or components of the system 500 .
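  • One way to picture the three kinds of policy is as small records, with the storage policy containing the schedule and retention information. The classes and field names below are assumptions chosen for illustration; the source only says that a storage policy is generally a data structure or other information source.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass(frozen=True)
class SchedulePolicy:
    weekday: int        # datetime.weekday() convention: 0 = Monday, 5 = Saturday
    window_start: time
    window_end: time

    def allows(self, when: datetime) -> bool:
        """True if `when` falls inside the window for storage operations."""
        return (when.weekday() == self.weekday
                and self.window_start <= when.time() < self.window_end)


@dataclass(frozen=True)
class RetentionPolicy:
    retain_for: timedelta  # how long data stays on a tier before aging or pruning


@dataclass(frozen=True)
class StoragePolicy:
    storage_location: str
    data_streams: int
    schedule: SchedulePolicy
    retention: RetentionPolicy
```

  • For the example given above, `SchedulePolicy(5, time(2, 0), time(4, 0))` would describe the Saturday window from 2:00 a.m. to 4:00 a.m.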
  • the system 500 may comprise a storage operation cell that is one of multiple storage operation cells arranged in a hierarchy or other organization.
  • Storage operation cells may be related to backup cells and provide some or all of the functionality of backup cells as described in the assignee's U.S. patent application Ser. No. 09/354,058, now U.S. Pat. No. 7,395,282, which is incorporated herein by reference in its entirety.
  • storage operation cells may also perform additional types of storage operations and other types of storage management functions that are not generally offered by backup cells.
  • Storage operation cells may contain not only physical devices, but also may represent logical concepts, organizations, and hierarchies.
  • a first storage operation cell may be configured to perform a first type of storage operations such as HSM operations, which may include backup or other types of data migration, and may include a variety of physical components including a storage manager 105 (or management agent 131 ), a secondary storage computing device 165 , a client 130 , and other components as described herein.
  • a second storage operation cell may contain the same or similar physical components; however, it may be configured to perform a second type of storage operations, such as storage resource management (SRM) operations, and may include monitoring a primary data copy or performing other known SRM operations.
  • each storage operation cell may contain the same or similar physical devices.
  • different storage operation cells may contain some of the same physical devices and not others.
  • a storage operation cell configured to perform SRM tasks may contain a secondary storage computing device 165 , client 130 , or other network device connected to a primary storage volume
  • a storage operation cell configured to perform HSM tasks may instead include a secondary storage computing device 165 , client 130 , or other network device connected to a secondary storage volume and not contain the elements or components associated with and including the primary storage volume.
  • connection does not necessarily require a physical connection; rather, it could refer to two devices that are operably coupled to each other, communicably coupled to each other, in communication with each other, or more generally, refer to the capability of two devices to communicate with each other.
  • These two storage operation cells may each include a different storage manager 105 that coordinates storage operations via the same secondary storage computing devices 165 and storage devices 115 .
  • This “overlapping” configuration allows storage resources to be accessed by more than one storage manager 105 , such that multiple paths exist to each storage device 115 facilitating failover, load balancing, and promoting robust data access via alternative routes.
  • the same storage manager 105 may control two or more storage operation cells (whether or not each storage operation cell has its own dedicated storage manager 105 ).
  • the extent or type of overlap may be user-defined (through a control console) or may be automatically configured to optimize data storage and/or retrieval.
  • Data agent 195 may be a software module or part of a software module that is generally responsible for performing storage operations on the data of the client 130 stored in data store 160 / 162 or other memory location. Each client 130 may have at least one data agent 195 and the system 500 can support multiple clients 130 . Data agent 195 may be distributed between client 130 and storage manager 105 (and any other intermediate components), or it may be deployed from a remote location or its functions approximated by a remote process that performs some or all of the functions of data agent 195 .
  • the overall system 500 may employ multiple data agents 195 , each of which may perform storage operations on data associated with a different application.
  • different individual data agents 195 may be designed to handle Microsoft Exchange data, UNIX data, Lotus Notes data, Microsoft Windows file system data, Microsoft Active Directory Objects data, and other types of data known in the art.
  • Other embodiments may employ one or more generic data agents 195 that can handle and process multiple data types rather than using the specialized data agents described above.
  • one data agent 195 may be required for each data type to perform storage operations on the data of the client 130 .
  • the client 130 may use one Microsoft Exchange Mailbox data agent 195 to back up the Exchange mailboxes, one Microsoft Exchange 2000 Database data agent 195 to back up the Exchange databases, one Microsoft Exchange 2000 Public Folder data agent 195 to back up the Exchange 2000 Public Folders, and one Microsoft Windows File System data agent 195 to back up the file system of the client 130 .
  • These data agents 195 would be treated as four separate data agents 195 by the system even though they reside on the same client 130 .
  • the overall system 500 may use one or more generic data agents 195 , each of which may be capable of handling two or more data types.
  • one generic data agent 195 may be used to back up, migrate and restore Microsoft Exchange 2000 Mailbox data and Microsoft Exchange Database data while another generic data agent 195 may handle Microsoft Exchange Public Folder data and Microsoft Windows File System data, etc.
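  • The difference between specialized and generic data agents can be sketched as a mapping from data types to the agent that handles them. `DataAgent` and `build_dispatch` are hypothetical names, and the rule that each data type belongs to exactly one agent per client is an assumption of this sketch.

```python
from typing import Dict, FrozenSet, List


class DataAgent:
    """Hypothetical data agent that handles one or more data types."""

    def __init__(self, name: str, data_types: FrozenSet[str]) -> None:
        self.name = name
        self.data_types = data_types


def build_dispatch(agents: List[DataAgent]) -> Dict[str, DataAgent]:
    """Map each data type to the agent responsible for it on a client."""
    dispatch: Dict[str, DataAgent] = {}
    for agent in agents:
        for data_type in agent.data_types:
            if data_type in dispatch:
                raise ValueError(f"{data_type} is claimed by two agents")
            dispatch[data_type] = agent
    return dispatch
```

  • Four specialized agents would each register a single data type, while two generic agents could cover the same four data types between them.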
  • Data agents 195 may be responsible for arranging or packing data to be copied or migrated into a certain format such as an archive file. Nonetheless, it will be understood that this represents only one example, and any suitable packing or containerization technique or transfer methodology may be used if desired.
  • Such an archive file may include metadata, a list of the files or data objects copied, and the files and data objects themselves.
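  • A minimal sketch of such packing is shown below. The use of a ZIP container, the `manifest.json` name, and the `pack_archive` function are assumptions made for illustration; the source leaves the format open and notes that any suitable packing or containerization technique may be used.

```python
import io
import json
import zipfile
from typing import Dict


def pack_archive(objects: Dict[str, bytes], metadata: Dict[str, str]) -> bytes:
    """Pack data objects, a list of their names, and metadata into one container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The manifest carries the metadata and the list of objects copied.
        manifest = {"metadata": metadata, "objects": sorted(objects)}
        archive.writestr("manifest.json", json.dumps(manifest))
        # The data objects themselves follow as individual members.
        for name, payload in objects.items():
            archive.writestr(f"objects/{name}", payload)
    return buffer.getvalue()
```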
  • any data moved by the data agents may be tracked within the system by updating indexes associated with appropriate storage managers 105 or secondary storage computing devices 165 .
  • a file or a data object refers to any collection or grouping of bytes of data that can be viewed as one or more logical units.
  • storage manager 105 may be a software module or other application that coordinates and controls storage operations performed by the system 500 .
  • Storage manager 105 may communicate with some or all elements of the system 500 , including clients 130 , data agents 195 , secondary storage computing devices 165 , and storage devices 115 , to initiate and manage storage operations (e.g., backups, migrations, data recovery operations, etc.).
  • Storage manager 105 may include a jobs agent 120 that monitors the status of some or all storage operations previously performed, currently being performed, or scheduled to be performed by the system 500 .
  • jobs agent 120 may be communicatively coupled to an interface agent 125 (e.g., a software module or application).
  • interface agent 125 may include information processing and display software, such as a graphical user interface (“GUI”), an application programming interface (“API”), or other interactive interface through which users and system processes can retrieve information about the status of storage operations.
  • users may optionally issue instructions to various storage operation cells regarding performance of the storage operations as described and contemplated herein. For example, a user may modify a schedule concerning the number of pending snapshot copies or other types of copies scheduled as needed to suit particular needs or requirements. As another example, a user may employ the GUI to view the status of pending storage operations in some or all of the storage operation cells in a given network or to monitor the status of certain components in a particular storage operation cell (e.g., the amount of storage capacity left in a particular storage device 115 ).
  • Storage manager 105 may also include a management agent 131 that is typically implemented as a software module or application program.
  • management agent 131 provides an interface that allows various management agents 131 in other storage operation cells to communicate with one another. For example, assume a certain network configuration includes multiple storage operation cells hierarchically arranged or otherwise logically related in a WAN or LAN configuration. With this arrangement, each storage operation cell may be connected to the other through each respective interface agent 125 . This allows each storage operation cell to send and receive certain pertinent information from other storage operation cells, including status information, routing information, information regarding capacity and utilization, etc. These communications paths may also be used to convey information and instructions regarding storage operations.
  • a management agent 131 in a first storage operation cell may communicate with a management agent 131 in a second storage operation cell regarding the status of storage operations in the second storage operation cell.
  • Another illustrative example includes the case where a management agent 131 in a first storage operation cell communicates with a management agent 131 in a second storage operation cell to control storage manager 105 (and other components) of the second storage operation cell via management agent 131 contained in storage manager 105 .
  • management agent 131 in a first storage operation cell communicates directly with and controls the components in a second storage operation cell and bypasses the storage manager 105 in the second storage operation cell.
  • storage operation cells can also be organized hierarchically such that hierarchically superior cells control or pass information to hierarchically subordinate cells or vice versa.
  • Storage manager 105 may also maintain an index, a database, or other data structure 111 .
  • the data stored in database 111 may be used to indicate logical associations between components of the system, user preferences, management tasks, media containerization and data storage information or other useful data.
  • the storage manager 105 may use data from database 111 to track logical associations between secondary storage computing device 165 and storage devices 115 (or movement of data as containerized from primary to secondary storage).
  • the secondary storage computing device 165 which may also be referred to as a media agent, may be implemented as a software module that conveys data, as directed by storage manager 105 , between a client 130 and one or more storage devices 115 such as a tape library, a magnetic media storage device, an optical media storage device, or any other suitable storage device.
  • secondary storage computing device 165 may be communicatively coupled to and control a storage device 115 .
  • a secondary storage computing device 165 may be considered to be associated with a particular storage device 115 if that secondary storage computing device 165 is capable of routing and storing data to that particular storage device 115 .
  • a secondary storage computing device 165 associated with a particular storage device 115 may instruct the storage device to use a robotic arm or other retrieval means to load or eject a certain storage media, and to subsequently archive, migrate, or restore data to or from that media.
  • Secondary storage computing device 165 may communicate with a storage device 115 via a suitable communications path such as a SCSI or Fibre Channel communications link.
  • the storage device 115 may be communicatively coupled to the storage manager 105 via a SAN.
  • Each secondary storage computing device 165 may maintain an index, a database, or other data structure 161 that may store index data generated during storage operations for secondary storage (SS) as described herein, including creating a metabase (MB). For example, performing storage operations on Microsoft Exchange data may generate index data. Such index data provides a secondary storage computing device 165 or other external device with a fast and efficient mechanism for locating data stored or backed up.
  • a secondary storage computing device index 161 may store data associating a client 130 with a particular secondary storage computing device 165 or storage device 115 , for example, as specified in a storage policy, while a database or other data structure in secondary storage computing device 165 may indicate where specifically the data of the client 130 is stored in storage device 115 , what specific files were stored, and other information associated with storage of the data of the client 130 .
  • index data may be stored along with the data backed up in a storage device 115 , with an additional copy of the index data written to index cache in a secondary storage device.
  • the data is readily available for use in storage operations and other activities without having to be first retrieved from the storage device 115 .
  • information stored in cache is typically recent information that reflects certain particulars about operations that have recently occurred. After a certain period of time, this information is sent to secondary storage and tracked. This information may need to be retrieved and uploaded back into a cache or other memory in a secondary computing device before data can be retrieved from storage device 115 .
  • the cached information may include information regarding format or containerization of archives or other files stored on storage device 115 .
  • One or more of the secondary storage computing devices 165 may also maintain one or more single instance databases 123 .
  • Single instancing (alternatively called data deduplication) generally refers to storing in secondary storage only a single instance of each data object (or data block) in a set of data (e.g., primary data). More details as to single instancing may be found in one or more of the following commonly-assigned U.S. patent applications: 1) U.S. patent application Ser. No. 11/269,512 (entitled SYSTEM AND METHOD TO SUPPORT SINGLE INSTANCE STORAGE OPERATIONS; 2) U.S. patent application Ser. No.
  • variable instancing generally refers to storing in secondary storage one or more instances, but fewer than the total number of instances, of each data block (or data object) in a set of data (e.g., primary data). More details as to variable instancing may be found in the commonly-assigned U.S. Pat. App. No. 61/164,803 (entitled STORING A VARIABLE NUMBER OF INSTANCES OF DATA OBJECTS).
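  • Single instancing can be illustrated with a small content-addressed store. Identifying blocks by a SHA-256 digest is an assumption of this sketch and is not taken from the source; `SingleInstanceStore` and its fields are hypothetical names rather than the actual single instance database 123 .

```python
import hashlib
from typing import Dict


class SingleInstanceStore:
    """Hypothetical store that keeps one instance of each distinct data block."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}    # digest -> the single stored instance
        self.references: Dict[str, int] = {}  # digest -> number of logical copies

    def put(self, block: bytes) -> str:
        """Store a block if it is new and return the digest that identifies it."""
        digest = hashlib.sha256(block).hexdigest()
        if digest not in self.blocks:
            self.blocks[digest] = block       # first instance: keep the bytes
        self.references[digest] = self.references.get(digest, 0) + 1
        return digest
```

  • A variable instancing policy could be approximated in the same structure by storing an additional physical instance once the reference count passes some threshold, although the source does not describe that mechanism here.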
  • a client 130 such as a data agent 195 , or a storage manager 105 , coordinates and directs local archiving, migration, and retrieval application functions as further described in the previously-referenced U.S. patent application Ser. No. 09/610,738.
  • This client 130 can function independently or together with other similar clients 130 .
  • each secondary storage computing device 165 has its own associated metabase 161 .
  • Each client 130 may also have its own associated metabase 170 .
  • each “tier” of storage such as primary storage, secondary storage, tertiary storage, etc., may have multiple metabases or a centralized metabase, as described herein.
  • the metabases on this storage tier may be centralized.
  • second and other tiers of storage may have either centralized or distributed metabases.
  • mixed architecture systems may be used if desired, that may include a first tier centralized metabase system coupled to a second tier storage system having distributed metabases and vice versa, etc.
  • a storage manager 105 or other management module may keep track of certain information that allows the storage manager 105 to select, designate, or otherwise identify metabases to be searched in response to certain queries as further described herein. Movement of data between primary and secondary storage may also involve movement of associated metadata and other tracking information as further described herein.
  • primary data may be organized into one or more sub-clients.
  • a sub-client is a portion of the data of one or more clients 130 , and can contain either all of the data of the clients 130 or a designated subset thereof.
  • the data store 162 includes two sub-clients.
  • an administrator or other user with the appropriate permissions; the term administrator is used herein for brevity
  • Systems and modules described herein may comprise software, firmware, hardware, or any combination(s) of software, firmware, or hardware suitable for the purposes described herein.
  • Software and other modules may reside on servers, workstations, personal computers, computerized tablets, PDAs, smart phones, and other devices suitable for the purposes described herein.
  • Modules described herein may be executed by a general-purpose computer, e.g., a server computer, wireless device, or personal computer.
  • aspects of the invention can be practiced with other communications, data processing, or computer system configurations, including: Internet appliances, hand-held devices (including personal digital assistants (PDAs)), wearable computers, all manner of cellular or mobile phones, multi-processor systems, microprocessor-based or programmable consumer electronics, set-top boxes, network PCs, mini-computers, mainframe computers, and the like.
  • the terms “computer,” “server,” “host,” “host system,” and the like are generally used interchangeably herein and refer to any of the above devices and systems, as well as any data processor.
  • aspects of the invention can be embodied in a special purpose computer or data processor that is specifically programmed, configured, or constructed to perform one or more of the computer-executable instructions explained in detail herein.
  • Software and other modules may be accessible via local memory, a network, a browser, or other application in an ASP context, or via another means suitable for the purposes described herein. Examples of the technology can also be practiced in distributed computing environments where tasks or modules are performed by remote processing devices, which are linked through a communications network, such as a Local Area Network (LAN), Wide Area Network (WAN), or the Internet. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
  • Data structures described herein may comprise computer files, variables, programming arrays, programming structures, or any electronic information storage schemes or methods, or any combinations thereof, suitable for the purposes described herein.
  • User interface elements described herein may comprise elements from graphical user interfaces, command line interfaces, and other interfaces suitable for the purposes described herein.
  • Examples of the technology may be stored or distributed on computer-readable media, including magnetically or optically readable computer disks, hard-wired or preprogrammed chips (e.g., EEPROM semiconductor chips), nanotechnology memory, biological memory, or other data storage media.
  • computer-implemented instructions, data structures, screen displays, and other data under aspects of the invention may be distributed over the Internet or over other networks (including wireless networks), on a propagated signal on a propagation medium (e.g., an electromagnetic wave(s), a sound wave, etc.) over a period of time, or they may be provided on any analog or digital network (packet switched, circuit switched, or other scheme).
  • the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense, as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to.”
  • the terms “connected,” “coupled,” or any variant thereof mean any connection or coupling, either direct or indirect, between two or more elements; the coupling or connection between the elements can be physical, logical, or a combination thereof.
  • the words “herein,” “above,” “below,” and words of similar import when used in this application, refer to this application as a whole and not to any particular portions of this application.
  • words in the above Detailed Description using the singular or plural number may also include the plural or singular number respectively.
  • the word “or,” in reference to a list of two or more items, covers all of the following interpretations of the word: any of the items in the list, all of the items in the list, and any combination of the items in the list.

Abstract

A system for archiving data objects using secondary copies is disclosed. The system creates one or more secondary copies of primary data that contains multiple data objects. The system may maintain a first data structure that tracks the data objects for which the system has created secondary copies and the locations of the secondary copies. To archive data objects in the primary data, the system identifies data objects to be archived, verifies that previously-created secondary copies of the identified data objects exist, and replaces the identified data objects with stubs. The system may maintain a second data structure that both tracks the stubs and refers to the first data structure, thereby creating an association between the stubs and the locations of the secondary copies.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of U.S. patent application Ser. No. 15/476,613, filed Mar. 31, 2017, which is a continuation of U.S. patent application Ser. No. 15/013,138, filed Feb. 2, 2016, issued as U.S. Pat. No. 9,639,563, which is a continuation of U.S. patent application Ser. No. 14/595,984, filed Jan. 13, 2015, issued as U.S. Pat. No. 9,262,275, which is a continuation of U.S. patent application Ser. No. 13/250,824, filed Sep. 30, 2011, issued as U.S. Pat. No. 8,935,492, which claims the benefit of U.S. Patent Application No. 61/388,566, filed Sep. 30, 2010, each of which is hereby incorporated herein by reference in its entirety.
BACKGROUND
A primary copy of data is generally a production copy or other “live” version of the data which is used by a software application and is generally in the native format of that application. Primary copy data may be maintained in a local memory or other high-speed storage device that allows for relatively fast data access if necessary. Such primary copy data is typically intended for short term retention (e.g., several hours or days) before some or all of the data is stored as one or more secondary copies, for example, to prevent loss of data in the event a problem occurs with the data stored in primary storage.
To protect primary copy data or for other purposes, such as regulatory compliance, secondary copies (alternatively referred to as “data protection copies”) can be made. Examples of secondary copies include a backup copy, a snapshot copy, a hierarchical storage management (“HSM”) copy, an archive copy, and other types of copies.
A backup copy is generally a point-in-time copy of the primary copy data stored in a backup format as opposed to in native application format. For example, a backup copy may be stored in a backup format that is optimized for compression and efficient long-term storage. Backup copies generally have relatively long retention periods and may be stored on media with slower retrieval times than other types of secondary copies and media. In some cases, backup copies may be stored at an offsite location.
After an initial, full backup of a data set is performed, periodic, intermittent, or continuous incremental backup operations may be subsequently performed on the data set. Each incremental backup operation copies only the primary copy data that has changed since the last full or incremental backup of the data set was performed. In this way, even if the entire set of primary copy data that is backed up is large, the amount of data that must be transferred during each incremental backup operation may be significantly smaller, since only the changed data needs to be transferred to secondary storage. Combined, one or more full backup and subsequent incremental copies may be utilized together to periodically or intermittently create a synthetic full backup copy. More details regarding synthetic storage operations are found in commonly-assigned U.S. patent application Ser. No. 12/510,059, entitled “Snapshot Storage and Management System with Indexing and User Interface,” filed Jul. 27, 2009, now U.S. Pat. No. 7,873,806, which is hereby incorporated herein in its entirety.
An archive copy is generally a copy of the primary copy data, but typically includes only a subset of the primary copy data that meets certain criteria and is usually stored in a format other than the native application format. For example, an archive copy might include only that data from the primary copy that is larger than a given size threshold or older than a given age threshold and that is stored in a backup format. Often, archive data is removed from the primary copy, and a stub is stored in the primary copy to indicate its new location. When a user requests access to the archive data that has been removed or migrated, systems use the stub to locate the data and often make recovery of the data appear transparent, even though the archive data may be stored at a location different from the remaining primary copy data.
Archive copies are typically created and tracked independently of other secondary copies, such as other backup copies. For example, to create a backup copy, the data storage system transfers a secondary copy of primary copy data to secondary storage and tracks the backup copy using a backup index separate from the archive index. To create an archive copy, a conventional data storage system transfers the primary copy data to be archived to secondary storage to create an archive copy, replaces the primary copy data with a stub, and tracks the archive copy using an archive index. Accordingly, a primary copy data object that is both archived and backed up is transferred to secondary storage two separate times.
Since each transfer consumes network and computing resources, the data storage system may not be able to devote such resources to other tasks. Moreover, the data storage system is required to devote resources to maintaining each separate index. In some cases, the archive index may be unaware of the other secondary copy and the other secondary index may be unaware of the archive copy, which may lead to further inefficiencies. Moreover, in some cases, in the event that an archive copy is moved or transferred (e.g., to another tier of secondary storage), the archive index may not be able to be updated to reflect the move or transfer. In such cases, the data storage system may be unable to use the stub to locate the archived data object.
Also, in conventional systems, archiving operations may require the transfer of large quantities of data during a single archive operation. For example, the retention criteria for an organization may specify that data objects more than two years old should be archived. On the first day of the organization's operation, it may be entirely unnecessary to archive any data, since the only data that exists at that point is newly created and thus ineligible for archiving. However, over the course of two years of operations, the organization may amass large quantities of data. Thus, when the first archive operation finally occurs, e.g., approximately two years into the operation of the organization, it may be necessary to transfer a large amount of the organization's data.
Additionally, backup, archive, and other secondary storage operations may unnecessarily preserve secondary copies of data created from primary data that has been deleted or is otherwise no longer being actively used as production data by a computing system, such as a workstation or server. Thus, secondary storage requirements may increasingly and unnecessarily bloat over time.
The need exists for systems and methods that overcome the above problems, as well as systems and methods that provide additional benefits. Overall, the examples herein of some prior or related systems and methods and their associated limitations are intended to be illustrative and not exclusive. Other limitations of existing or prior systems and methods will become apparent to those of skill in the art upon reading the following Detailed Description.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram illustrating an environment in which a system for archiving data objects using secondary copies operates.
FIG. 2 is a flow diagram illustrating a process implemented by the system in connection with archiving data objects using secondary copies.
FIG. 3 is a flow diagram illustrating a process implemented by the system in connection with reclaiming space used to store secondary copies.
FIGS. 4A-4C are data structure diagrams illustrating data structures used by the system.
FIG. 5 is a block diagram illustrating a data storage system in which the system operates.
DETAILED DESCRIPTION
The headings provided herein are for convenience only and do not necessarily affect the scope or meaning of the disclosure.
Overview
A software, firmware, and/or hardware system for archiving data objects using secondary copies (the “system”) is disclosed. The system creates one or more secondary copies of primary copy data (e.g., production data stored by a production computing system). The primary copy data contains multiple data objects (e.g., multiple files, emails, or other logical groupings or collections of data). The system maintains a first data structure that tracks the data objects for which the system has created secondary copies and the locations of the secondary copies.
To archive data objects in the primary copy data, the system applies rules to determine which data objects are to be archived. The system then verifies that previously-created secondary copies of data objects to be archived exist and replaces the data objects with stubs, pointers or logical addresses. The system maintains a second data structure that both tracks the stubs and refers to the first data structure, thereby creating an association between the stubs and the locations of the secondary copies. Notably, the system archives data objects without creating an additional or other secondary copy of the data objects. Instead, the association between the two data structures allows stubs to point to or refer to the previously-created secondary copy of the data objects. Accordingly, the existence of the previously-created secondary copy of the data objects allows the system to forego creating an additional or other secondary copy of the data objects, thereby saving resources.
The system may also perform a process to reclaim space used to store secondary copies. To do so, the system scans or analyzes the primary copy data to identify the data objects that exist in the primary copy data and stores the results of the scan or analysis in a third data structure. The system then compares the first and third data structures (e.g., the system performs a difference of the first and third data structures) to determine which data objects in the primary copy data have been deleted. For each deleted data object, the system updates the corresponding entry in the first data structure. Then the system accesses the first data structure and determines 1) which data objects in the primary copy data have not been deleted and 2) which have been deleted, but whose deletion occurred less than a predetermined period of time ago. For each data object determined in this fashion, the system then creates, from the first secondary copy of the data object, a second secondary copy of the data object. The system can then create a new first data structure or update the existing first data structure to reflect the second secondary copies of the data objects.
Various examples of the invention will now be described. The following description provides specific details for a thorough understanding and enabling description of these examples. One skilled in the relevant art will understand, however, that the invention may be practiced without many of these details. Likewise, one skilled in the relevant art will also understand that the invention may include many other obvious features not described in detail herein. Additionally, some well-known structures or functions may not be shown or described in detail below, so as to avoid unnecessarily obscuring the relevant description.
The terminology used below is to be interpreted in its broadest reasonable manner, even though it is being used in conjunction with a detailed description of certain specific examples of the invention. Indeed, certain terms may even be emphasized below; however, any terminology intended to be interpreted in any restricted manner will be overtly and specifically defined as such in this Detailed Description section.
Illustrative Environment
FIG. 1 is a block diagram illustrating an environment 100 in which the system may operate. The environment 100 includes one or more clients 130, one or more primary data stores 160, a secondary storage computing device 165 (alternatively referred to as a “media agent”), and one or more storage devices 115. Each of the clients 130 is a computing device, examples of which are described herein. Clients may be, as non-exclusive examples, servers, workstations, personal computers, computerized tablets, PDAs, smart phones, or other computers having social networking data, such as Facebook data. The clients 130 are each connected to one or more associated primary data stores 160 and to the secondary storage computing device 165. The secondary storage computing device 165 is connected to the storage device 115. The primary data stores 160 and storage device 115 may each be any type of storage suitable for storing data, such as Directly-Attached Storage (DAS) such as hard disks, a Storage Area Network (SAN), e.g., a Fibre Channel SAN, an iSCSI SAN or other type of SAN, Network-Attached Storage (NAS), a tape library, or any other type of storage. The clients 130 and the secondary storage computing device 165 typically include application software to perform desired operations and an operating system on which the application software runs. The clients 130 and the secondary storage computing device 165 typically also include a file system that facilitates and controls file access by the operating system and application software. The file system facilitates access to local and remote storage devices for file or data access and storage.
The clients 130, as part of their functioning, utilize data, which includes files, directories, metadata (e.g., ACLs, descriptive metadata, and any other streams associated with the data), and other data objects, which may be stored in the primary data store 160. The data of a client 130 is generally a primary copy (e.g., a production copy). Although described as a “client” of the secondary storage computing device 165, a client 130 may in fact be a production server, such as a file server or Exchange server, which provides live production data to multiple user workstations as part of its function. Each client 130 includes a data agent 195 (described in more detail with reference to FIG. 5). During a copy, backup, archive, or other storage operation, the data agents 195 send a copy of data objects in a primary data store 160 to the secondary storage computing device 165.
The secondary storage computing device 165 includes a memory 114. The memory 114 includes software 116 incorporating components 118 and data 119 typically used by the system. The components 118 include a secondary copy component 128 that performs secondary copy operations and a pruning component 129 that performs space reclamation or pruning operations. The data 119 includes secondary copy data structure 122, stubs data structure 124, and primary copy data structure 126. The system uses the data 119 to, among other things, track data objects copied during archive and other secondary copy operations and to track data objects in primary copy data.
While items 118 and 119 are illustrated as stored in memory 114, those skilled in the art will appreciate that these items, or portions of them, may be transferred between memory 114 and a persistent storage device 106 (for example, a magnetic hard drive, a tape of a tape library, etc.) for purposes of memory management, data integrity, and/or other purposes.
The secondary storage computing device 165 further includes one or more central processing units (CPU) 102 for executing software 116, and a computer-readable media drive 104 for reading information or installing software 116 from tangible computer-readable storage media, such as a floppy disk, a CD-ROM, a DVD, a USB flash drive, and/or other tangible computer-readable storage media. The secondary storage computing device 165 also includes one or more of the following: a network connection device 108 for connecting to a network, an information input device 110 (for example, a mouse, a keyboard, etc.), and an information output device 112 (for example, a display).
Illustrative Archiving Process and Data Structures
FIG. 2 is a flow diagram illustrating a process 200 implemented by the system in connection with archiving data objects using secondary copies in some examples. The process 200 begins at step 205, where the system creates a full secondary copy of the primary copy data of a client 130, by creating a secondary copy of the entire primary copy data and transferring the secondary copy to the storage device 115. The system may also create one or more incremental copies of the primary copy data by transferring only the primary copy data that has changed since the time of the full copy or a previous incremental copy. For example, the system may perform only a single full backup of all the primary copy data that is to be protected (as defined, for example, by a storage policy or other criteria) and store the full backup on the storage device 115. Thereafter, the system may then create weekly, daily, periodic, intermittent or continuous incremental backup copies of only the primary copy data that has changed since the system performed the last backup operation. In such examples, periodically the system may use one or more of the full backup, incremental backups, and/or previous synthetic full backups to generate a new synthetic full backup copy via a synthetic full operation. As part of a synthetic full backup operation, the system may process data objects that have been deleted from the primary copy of the data and remove these data objects from the synthetic full copy. In some examples, the generation of a new synthetic full backup copy or other synthetic full operation requires reading one or more previous backup copies or other types of secondary copies, rehydrating or decompressing the previous secondary copy or copies, and re-deduplicating the previous secondary copy or copies. 
In other examples, the generation of a new synthetic full backup copy or other synthetic operation does not require reading, rehydrating, or re-deduplicating a previous backup or other secondary copy. Instead, reference counts may be updated and metadata may be added to the synthetic full copy.
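By way of non-limiting illustration only, the merging of a full backup with later incremental backups into a synthetic full copy might be sketched as follows. The function name `synthetic_full`, the use of dictionaries keyed by path, and the sample paths and contents are all assumptions made for the sketch and are not taken from this disclosure; the sketch also ignores compression and deduplication.

```python
from typing import Dict, Iterable, Set


def synthetic_full(full: Dict[str, bytes],
                   incrementals: Iterable[Dict[str, bytes]],
                   deleted: Set[str]) -> Dict[str, bytes]:
    """Combine a full copy with incremental copies (oldest first)."""
    merged = dict(full)              # start from the full backup
    for incremental in incrementals:
        merged.update(incremental)   # newer versions replace older ones
    for path in deleted:
        merged.pop(path, None)       # drop objects deleted from the primary copy
    return merged


# Hypothetical sample data.
full_backup = {"a.txt": b"v1", "b.txt": b"v1", "c.txt": b"v1"}
incremental_backups = [{"a.txt": b"v2"}, {"d.txt": b"v1"}]
result = synthetic_full(full_backup, incremental_backups, deleted={"b.txt"})
```

In this sketch the synthetic full copy is produced entirely from existing secondary copies, so no additional data would need to be read from the primary copy data.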
At step 210 the system adds entries to the secondary copy data structure 122. FIG. 4A is a data structure diagram illustrating the secondary copy data structure 122. The secondary copy data structure 122 contains rows, such as rows 425 a and 425 b, each divided into the following columns: an ID column 405 containing an identifier of a data object (e.g., a globally unique identifier—GUID), a primary copy location column 410 containing the location of the primary copy of the data object, a secondary copy location column 415 containing the location of the secondary copy of the data object, and a deletion time column 420 containing a time stamp of when the primary copy of the data object was deleted. The secondary copy data structure 122 may also include other columns that may contain additional data about data objects.
Although absolute locations for the primary copy and the secondary copy are shown in FIG. 4A, the system may additionally or alternatively use relative locations to indicate the locations of data objects in the secondary copy data structure 122. For example, the system may store secondary copies of data objects using a logical archive file and specify a relative location within the logical archive file for a secondary copy location. As another example, the system may store secondary copies of data objects on tape and specify a tape and an offset within the tape for a secondary copy location. Those of skill in the art will understand that secondary copies can be stored using varied techniques and that the system is not limited to the techniques expressly illustrated or described in this disclosure.
Moreover, although FIG. 4A illustrates entries corresponding to files in the secondary copy data structure 122, the disclosed techniques may also be used with other types of data objects, such as emails and email attachments, database or spreadsheet objects, data blocks, and other data objects stored in other data repositories. Accordingly, the disclosure is not to be construed as limited solely to files.
The system may utilize a single secondary copy data structure 122 for each client 130 (or subclient thereof) or for each set of data subject to data protection operations, which may be the data of a single client 130 or the data of multiple clients 130. Additionally or alternatively, the system may use a single secondary copy data structure 122 for multiple clients 130 or for multiple sets of data subject to data protection operations, which may be the data of a single client 130 or the data of multiple clients 130. In such a case, the secondary copy data structure 122 may contain additional columns containing data that allows for differentiation of data associated with different clients 130 or different sets of data.
In adding entries for each new copy of a data object, the system adds a new row 425 to the secondary copy data structure 122. The system may generate the identifier for each secondary copy of a data object created and, in the new row 425, add the identifier to column 405, add the primary copy location of the data object to column 410, and add the secondary copy location to column 415. The system may also store additional data as part of step 210, such as in other columns of the secondary copy data structure 122 or in other data structures.
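As a minimal sketch only, a row of the secondary copy data structure 122 and the adding of an entry at step 210 might be modeled as shown below. The class names, field names, and sample locations are hypothetical and are chosen only to mirror columns 405 through 420; the disclosure does not prescribe any particular implementation.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class SecondaryCopyEntry:
    """One row of the table; field names are assumptions for this sketch."""
    object_id: str                             # column 405: identifier (e.g., a GUID)
    primary_location: str                      # column 410
    secondary_location: str                    # column 415
    deletion_time: Optional[datetime] = None   # column 420; None means not deleted


class SecondaryCopyTable:
    """Hypothetical in-memory stand-in for secondary copy data structure 122."""

    def __init__(self) -> None:
        self.rows: Dict[str, SecondaryCopyEntry] = {}

    def add_entry(self, primary_location: str, secondary_location: str) -> str:
        # Step 210: generate an identifier and record both locations.
        object_id = str(uuid.uuid4())
        self.rows[object_id] = SecondaryCopyEntry(
            object_id, primary_location, secondary_location
        )
        return object_id


table = SecondaryCopyTable()
oid = table.add_entry("C:/docs/report.doc", "archive_file_1/offset_0")
```

The deletion time is left empty when the row is created and would be filled in only once the corresponding data object is found to have been deleted from the primary copy data.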
Returning to FIG. 2, at step 215 the system identifies data objects in the primary copy data that are to be archived. For example, the system may apply one or more rules or criteria based on any combination of data object type, data object age, data object size, percentage of disk quota, remaining storage, metadata (e.g., a flag or tag indicating importance) and/or other factors. At step 220 the system verifies that a secondary copy of each data object has been made. To do so, the system may access the secondary copy data structure 122 to determine that secondary copies of the identified data objects exist. Also at step 220, the system obtains a token for each identified data object. The token represents confirmation or verification that a secondary copy of a data object was previously created, and is typically unique for each data object. At step 225, the system replaces each of the identified data objects in the primary copy data with a stub containing the token. The stub is typically a small data object that indicates, points to, or refers to the location of the secondary copy of the data object and facilitates recovery of the data object. More details as to archiving operations may be found in the commonly-assigned currently pending U.S. Patent Application Number 2008/0229037, the entirety of which is incorporated by reference herein.
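Steps 215 through 225 might be sketched, purely as an illustration, as follows. The rule shown (an age threshold combined with a size threshold), the use of a random hexadecimal string as the token, and all names and sample values are assumptions for the sketch; the disclosure leaves the rules and the token format open.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataObject:
    object_id: str
    age_days: int
    size_bytes: int


def select_for_archiving(objects: List[DataObject], min_age_days: int,
                         min_size_bytes: int) -> List[DataObject]:
    # Step 215: apply one assumed rule combining object age and object size.
    return [o for o in objects
            if o.age_days >= min_age_days and o.size_bytes >= min_size_bytes]


def obtain_token(object_id: str, secondary_copies: Dict[str, str]) -> Optional[str]:
    # Step 220: a token is issued only if a secondary copy is already tracked.
    if object_id not in secondary_copies:
        return None
    return uuid.uuid4().hex


def archive(objects: List[DataObject], secondary_copies: Dict[str, str],
            min_age_days: int, min_size_bytes: int) -> Dict[str, str]:
    stubs: Dict[str, str] = {}
    for obj in select_for_archiving(objects, min_age_days, min_size_bytes):
        token = obtain_token(obj.object_id, secondary_copies)
        if token is None:
            continue                      # no verified secondary copy: do not archive
        stubs[obj.object_id] = token      # step 225: the stub carries the token
    return stubs


# Hypothetical sample data: "b" meets the rule but has no secondary copy.
objects = [
    DataObject("a", age_days=800, size_bytes=5_000_000),
    DataObject("b", age_days=900, size_bytes=9_000_000),
    DataObject("c", age_days=3, size_bytes=9_000_000),
]
secondary_copies = {"a": "archive_file_1/offset_0", "c": "archive_file_1/offset_4096"}
stubs = archive(objects, secondary_copies, min_age_days=730, min_size_bytes=1_000_000)
```

Note that no data object is transferred to secondary storage in this sketch; archiving relies on the secondary copy that already exists.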
At step 230 the system copies the stubs in the primary copy data to the storage device 115. At step 235 the system adds entries to the stubs data structure 124. FIG. 4B is a data structure diagram illustrating the stubs data structure 124. The stubs data structure 124 contains rows, such as rows 465 a and 465 b, each divided into the following columns: an ID column 455 containing the identifier of a data object (e.g., the GUID) and a token column 460 containing the token previously created or generated for the data object. The stubs data structure 124 may also include other columns that may contain additional data about data objects. The system may utilize a single stubs data structure 124 for a single secondary copy data structure 122, a single stubs data structure 124 for multiple secondary copy data structures 122, and/or multiple stubs data structures 124 for multiple secondary copy data structures 122.
In adding entries, the system adds a new row 465 to the stubs data structure 124. In the new row 465 the system adds the identifier that corresponds to the data object associated with the stub to column 455 and the token obtained in step 220 to column 460. The system may also store additional data as part of step 235, such as in other columns of the stubs data structure 124 or in other data structures. The entries in rows 465 a and 465 b indicate that the system archived the data objects identified in rows 425 a and 425 d, respectively, of the secondary copy data structure 122. Also in step 235 the system adds entries to the secondary copy data structure 122 for the stubs. In FIG. 4A, rows 425 f and 425 g correspond to the entries for the stubs.
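The association between the stubs data structure 124 and the secondary copy data structure 122 might be illustrated, as a sketch under stated assumptions, by resolving the token carried in a stub to a secondary copy location through the shared identifier. The dictionaries, identifiers, tokens, and locations below are invented for the illustration.

```python
from typing import Dict, Optional

# Hypothetical contents of the two data structures.
secondary_copy_table: Dict[str, str] = {   # ID (column 405) -> location (column 415)
    "guid-1": "archive_file_1/offset_0",
    "guid-2": "archive_file_1/offset_4096",
}
stubs_table: Dict[str, str] = {            # ID (column 455) -> token (column 460)
    "guid-1": "token-abc",
}


def resolve_stub(token: str) -> Optional[str]:
    """Follow token -> identifier -> secondary copy location."""
    for object_id, stored_token in stubs_table.items():
        if stored_token == token:
            return secondary_copy_table.get(object_id)
    return None


location_before = resolve_stub("token-abc")

# If the secondary copy is later moved, only the secondary copy table changes;
# the stub and its token remain valid.
secondary_copy_table["guid-1"] = "tape_7/offset_1024"
location_after = resolve_stub("token-abc")
```

Under this sketch, moving a secondary copy to another tier of storage would not invalidate the stub, because the stub refers to the location only indirectly.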
Returning to FIG. 2, at step 240, the system determines which data objects in the primary copy data have been deleted. The system may use various techniques to determine which data objects in the primary copy data have been deleted. For example, the system may scan or analyze the primary copy data on a periodic or ad-hoc basis, and populate a data structure that contains entries for each of the data objects in the primary copy data. FIG. 4C is a data structure diagram illustrating the primary copy data structure 126. The primary copy data structure 126 (alternatively referred to as an “image map”) is generally similar to the secondary copy data structure 122 but contains entries only for data objects existing in the primary copy data as of the most recent scan or analysis of the primary copy data. To determine the data objects that have been deleted, the system can compare the secondary copy data structure 122 with the primary copy data structure 126. The data objects that are in the secondary copy data structure 122 but not in the primary copy data structure 126 are the data objects that have been deleted. Additionally or alternatively, the system can use other techniques to determine when a data object in the primary copy data has been deleted, such as by receiving information from a driver or file system filter on the client 130 that detects such deletions. Additionally or alternatively, the system can predict if and when a data object in primary copy data has been deleted based upon information available to the system, such as heuristics or historical data.
Returning to FIG. 2, at step 245 the system updates the entries in the secondary copy data structure 122 corresponding to the deleted data objects to include their deletion times. The system may use the time of the last scan or analysis as the deletion times or may use the actual deletion times of the data objects. After step 245, the process 200 concludes.
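The comparison of step 240 and the update of step 245 might be sketched as a set difference followed by a time stamp assignment. The function name, the representation of the two data structures as a dictionary and a set, and the sample values are assumptions; the sketch uses the scan time as the deletion time, which is one of the two options described above.

```python
from datetime import datetime
from typing import Dict, Optional, Set


def mark_deleted(deletion_times: Dict[str, Optional[datetime]],
                 primary_ids: Set[str],
                 scan_time: datetime) -> Set[str]:
    """Record a deletion time for tracked objects missing from the image map."""
    # Step 240: objects tracked in the secondary copy table but absent from
    # the primary copy data structure are treated as deleted.
    deleted = {oid for oid in deletion_times if oid not in primary_ids}
    # Step 245: stamp newly detected deletions; keep earlier time stamps.
    for oid in deleted:
        if deletion_times[oid] is None:
            deletion_times[oid] = scan_time
    return deleted


# Hypothetical sample data.
tracked = {"a": None, "b": None, "c": datetime(2010, 6, 25)}
image_map = {"a"}
newly_scanned = mark_deleted(tracked, image_map, datetime(2010, 9, 30))
```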
Those of skill in the art will understand that the process 200 may be varied while still coming within the general scope of the process 200. For example, if the system cannot verify that a secondary copy of the data object was previously created, the system may not archive the data object in the primary copy data. Alternatively, in such a case, the system may create a secondary copy of the data object and add an entry to the secondary copy data structure 122 before archiving the data object. Alternatively, the system may flag the data object for later archiving after the system has created a secondary copy of the data object at a later time. The system may perform other variations of the process 200.
Illustrative Space Reclamation Process
FIG. 3 is a flow diagram illustrating a process 300 implemented by the system in connection with reclaiming space used to store secondary copies in some examples (alternatively referred to as “pruning data”). The process 300 begins at step 305 where the system accesses the secondary copy data structure 122. At step 310, the system begins iterating through each entry in the secondary copy data structure 122. At step 315, the system determines whether the data object in the primary copy data identified in the entry has been deleted. If not, the process 300 continues to step 320, where the system creates a second secondary copy of the data object from the first secondary copy, and may delete the first secondary copy either immediately or at a later time, e.g., at the conclusion of the process 300. For example, the data object identified in row 425 a of the secondary copy data structure 122, because it has no deletion time, has not been deleted. The system can create the second secondary copy of the data object on the same media as the first secondary copy or on different media (e.g., if the first secondary copy is stored on disk, the system can create the second secondary copy on another disk, on tape, and/or on a cloud storage service).
If the system determines that the data object in the primary copy data has been deleted, the process 300 continues to step 335, where the system determines whether the deletion time of the data object is longer ago than a predetermined, configurable, period of time (e.g., longer than one year ago). For example, the data object identified in row 425 b, because it has a deletion time, has been deleted. If not (e.g., the data object was deleted less than a year ago), the process 300 continues to step 320, described above. If the deletion time of the data object is longer ago than the predetermined period of time, the process 300 skips step 320 (skips the step of creating a second secondary copy of the data object). Additionally, the system may delete the secondary copy of the long-deleted data object either immediately or at a later time, e.g., at the conclusion of the process 300. For example, if the system is performing the process 300 on Sep. 30, 2010 and the predetermined period of time is 90 days, then the system would not create a second secondary copy of the data object identified in row 425 b because it was deleted on Jun. 25, 2010. However, the system would create a second secondary copy of the data object identified in row 425 e because it was deleted on Jul. 10, 2010, which is less than 90 days before Sep. 30, 2010.
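The decision made at steps 315 and 335 can be checked against the dates given above with a short sketch. The function name `should_copy` is hypothetical; the dates and the 90-day period are those of the example.

```python
from datetime import date, timedelta
from typing import Optional


def should_copy(deletion_date: Optional[date], today: date,
                retention: timedelta) -> bool:
    """Return True if a second secondary copy would be created (step 320)."""
    if deletion_date is None:
        return True                                # step 315: not deleted
    return today - deletion_date <= retention      # step 335: deleted recently enough


run_date = date(2010, 9, 30)
period = timedelta(days=90)

age_425b = (run_date - date(2010, 6, 25)).days    # 97 days
age_425e = (run_date - date(2010, 7, 10)).days    # 82 days

copy_425a = should_copy(None, run_date, period)               # never deleted
copy_425b = should_copy(date(2010, 6, 25), run_date, period)  # deleted 97 days ago
copy_425e = should_copy(date(2010, 7, 10), run_date, period)  # deleted 82 days ago
```

The data object of row 425 b was deleted 97 days before the run date and is therefore skipped, whereas the data object of row 425 e was deleted 82 days before the run date and is therefore copied.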
The predetermined period of time acts as a timer that starts when a data object in primary copy data has been deleted (or when the system detects the deletion). After the timer has expired, the system no longer needs to store the secondary copy of the data object. Storing the secondary copy of the data object for a period of time past the deletion time of the data object in primary copy data allows the secondary copy of the data object to be retrieved or recalled if, for example, the data object needs to be recovered to satisfy an e-discovery or legal hold request. The predetermined period of time can be set according to archival rules or storage policies (e.g., to comply with e-discovery or other requirements). The predetermined period may vary based on the type of data object. For example, certain types of data objects (e.g., financial data) may have a longer predetermined period of time than other types of data (e.g., personal emails). The system may determine the data type by content indexing the data objects or by accessing data classifications of the data objects.
Moreover, the predetermined period of time allows for data objects to be recovered in the case of accidental or unintended deletion or in case data objects appear to have been deleted. For example, if a user accidentally or unintentionally deletes a data object in primary copy data, the user has until at least the expiration of the predetermined period of time to discover the accidental or unintended deletion and request that the deleted data object be recovered. As another example, if a volume containing a set of data objects becomes unmounted, upon scanning or analyzing the primary copy data, the system would determine that the data objects have been deleted and accordingly update the corresponding entries in the secondary copy data structure 122. As long as the volume is remounted before the predetermined period of time expires, the system will not delete the secondary copies of the data objects. When the volume is remounted, the system can recognize that the data objects are already tracked in the secondary copy data structure 122 and remove the deletion times from the corresponding entries in the secondary copy data structure 122.
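The unmount and remount behavior described above can be sketched as a reconciliation pass over the tracked entries. This is a simplified illustration, assuming each entry of the secondary copy data structure 122 can be modeled as a record with an optional deletion time; the `Entry` class, the `reconcile` function, and the object identifiers are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional, Set


@dataclass
class Entry:
    """Hypothetical model of one row of the secondary copy data structure."""
    secondary_location: str
    deletion_time: Optional[datetime] = None


def reconcile(entries: Dict[str, Entry], present: Set[str],
              now: datetime) -> None:
    """Update deletion times after scanning the primary copy data.

    `present` holds the identifiers of objects found during the scan.
    """
    for object_id, entry in entries.items():
        if object_id in present:
            # Object is visible again (e.g., volume remounted): clear the timer.
            entry.deletion_time = None
        elif entry.deletion_time is None:
            # First scan in which the object is missing: start the timer.
            entry.deletion_time = now
        # Otherwise the object was already marked; keep the original time.
```

Keeping the original deletion time on repeated scans means the timer is not restarted each time the object is found to be missing.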
At step 325 the system moves to the next entry in the secondary copy data structure 122 and performs the above steps with respect to the data object identified in the next entry. After the system has iterated through all of the entries in the secondary copy data structure 122, the process 300 continues at step 330, where the system generates a new secondary copy data structure 122 that includes entries corresponding to only the data objects for which the system created second secondary copies. The new secondary copy data structure 122 also includes the locations of the second secondary copies of the data objects. At step 330, the system may also delete the old secondary copy data structure. After step 330 the process 300 concludes.
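The iteration over the entries and the generation of the new secondary copy data structure at step 330 can be sketched as below. This is a minimal illustration, not the implementation of the process 300: the `Entry` record, the `build_new_structure` function, and the `copy_fn` callback (which stands in for whatever mechanism creates the second secondary copy and returns its location) are hypothetical, and non-deleted objects are assumed to be copied.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Dict, Optional


@dataclass
class Entry:
    """Hypothetical model of one row of the secondary copy data structure."""
    secondary_location: str
    deletion_time: Optional[date] = None


def build_new_structure(old: Dict[str, Entry], run_date: date,
                        retention: timedelta,
                        copy_fn: Callable[[str], str]) -> Dict[str, Entry]:
    """Return a new structure with entries only for objects that were copied."""
    new: Dict[str, Entry] = {}
    for object_id, entry in old.items():  # moving to the next entry (step 325)
        expired = (entry.deletion_time is not None
                   and run_date - entry.deletion_time > retention)
        if expired:
            continue  # step 320 is skipped; no entry in the new structure
        new_location = copy_fn(entry.secondary_location)  # step 320
        new[object_id] = Entry(new_location, entry.deletion_time)
    return new  # generated at step 330
```

The returned structure records the locations of the second secondary copies, so the old structure can be discarded once the pass completes.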
Those of skill in the art will understand that the process 300 may be varied while still coming within the general scope of the process 300. For example, to prune data, instead of creating second secondary copies of data objects from the first secondary copies of data objects, the system may instead delete certain first secondary copies of data objects, e.g., those data objects having a deletion time longer ago than a predetermined, configurable, period of time. Instead of or in addition to creating a new secondary copy data structure 122, the system may delete rows from the existing secondary copy data structure 122 corresponding to the data objects having a deletion time longer ago than a predetermined, configurable, period of time, for which the system did not create second secondary copies. The system may also update the secondary copy locations of the rows corresponding to the data objects for which the system did create second secondary copies. As another example, instead of pruning a secondary copy of a data object in response to the deletion of the data object in the primary copy data, the system may additionally or alternatively prune a secondary copy of a data object when other criteria are met, such as criteria relating to the creation time, modification time, size, file type, or other characteristics of the data object in the primary copy data. The system may perform other variations of the process 300.
One advantage of the techniques described herein is that the system can avoid creating additional secondary copies of data objects in primary copy data when archiving the data objects. Instead, the system can use the associations between the secondary copy data structure 122 and the stubs data structure 124 to point or refer stubs to the previously-created secondary copy of the data objects. Accordingly, the existence of the previously-created secondary copy of the data objects allows the system to forego creating another secondary copy of the data objects when archiving the data objects, thereby saving resources. Since the system only transfers a data object from primary storage to secondary storage once instead of twice (e.g., once for backup, once for archive), it may save network bandwidth and processing capacity. Moreover, since the system often transfers a set of data objects from primary storage to secondary storage during the course of several incremental secondary copy operations (e.g., during several incremental backup operations), the system may avoid a single, large data transfer when it later archives the same set of data objects. Instead, the set of data objects in primary storage may simply be replaced with stubs when the time comes to archive them. As another example, since the system only stores a single copy of each data object in secondary storage, instead of two copies, the total secondary storage capacity needed by the system may be reduced.
Yet another advantage of the techniques described herein is that the system can use a common set of data structures to track both archive operations and other secondary copy operations, thereby potentially simplifying the tracking of both types of operations. Another advantage is that since only one secondary copy of a data object needs to be created, other ancillary processes such as content-indexing, encryption, compression, data classification and/or deduplication or single-instancing of the secondary copy need only be performed once on the single secondary copy, instead of multiple times on each secondary copy.
Another advantage of the techniques described herein is that the secondary copy data structure 122 can be updated to account for moved or transferred secondary copies (e.g., data objects moved to another tier of secondary storage). Accordingly, the stub of a data object whose secondary copy was moved or transferred can still be used to locate and recall the moved or transferred data object.
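The indirection that lets a stub survive the movement of its secondary copy can be sketched as two lookups. This is an illustrative model only: the `ArchiveCatalog` class, its method names, and the layout of the two dictionaries (standing in for the secondary copy data structure 122 and the stubs data structure 124) are assumptions, as are the example paths and locations.

```python
from typing import Dict


class ArchiveCatalog:
    """Toy stand-ins for data structures 122 and 124 (hypothetical layout)."""

    def __init__(self) -> None:
        self.secondary_copies: Dict[str, str] = {}  # object id -> location
        self.stubs: Dict[str, str] = {}             # stub path -> object id

    def archive(self, stub_path: str, object_id: str) -> None:
        # Reuse the previously created secondary copy; nothing is copied here.
        if object_id not in self.secondary_copies:
            raise KeyError(f"no secondary copy tracked for {object_id!r}")
        self.stubs[stub_path] = object_id

    def move_copy(self, object_id: str, new_location: str) -> None:
        # Only the tracked location changes; stubs are left untouched.
        if object_id not in self.secondary_copies:
            raise KeyError(f"no secondary copy tracked for {object_id!r}")
        self.secondary_copies[object_id] = new_location

    def recall_location(self, stub_path: str) -> str:
        # Resolve stub -> object id -> current secondary copy location.
        return self.secondary_copies[self.stubs[stub_path]]
```

Because the stub refers to the tracked entry rather than to a fixed location, updating the entry is enough to keep recall working after a move to another tier.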
Still another advantage of the techniques described herein is that by pruning data, e.g., in response to the deletion of corresponding primary data, the secondary storage capacity requirements are reduced.
Suitable Data Storage System
FIG. 5 illustrates an example of one arrangement of resources in a computing network, comprising a data storage system 500. The resources in the data storage system 500 may employ the processes and techniques described herein. The system 500 includes a storage manager 105, one or more data agents 195, one or more secondary storage computing devices 165, one or more storage devices 115, one or more computing devices 130 (called clients 130), one or more data or information stores 160 and 162, a single instancing database 123, an index 111, a jobs agent 120, an interface agent 125, and a management agent 131. The system 500 may represent a modular storage system such as the CommVault QiNetix system, and also the CommVault GALAXY backup system, available from CommVault Systems, Inc. of Oceanport, N.J., aspects of which are further described in the commonly-assigned U.S. patent application Ser. No. 09/610,738, now U.S. Pat. No. 7,035,880, the entirety of which is incorporated by reference herein. The system 500 may also represent a modular storage system such as the CommVault Simpana system, also available from CommVault Systems, Inc.
The system 500 may generally include combinations of hardware and software components associated with performing storage operations on electronic data. Storage operations include copying, backing up, creating, storing, retrieving, and/or migrating primary storage data (e.g., data stores 160 and/or 162) and secondary storage data (which may include, for example, snapshot copies, backup copies, hierarchical storage management (HSM) copies, archive copies, and other types of copies of electronic data stored on storage devices 115). The system 500 may provide one or more integrated management consoles for users or system processes to interface with in order to perform certain storage operations on electronic data as further described herein. Such integrated management consoles may be displayed at a central control system or several similar consoles distributed throughout multiple network locations to provide global or geographically specific network data storage information.
In one example, storage operations may be performed according to various storage preferences, for example, as expressed by a user preference, a storage policy, a schedule policy, and/or a retention policy. A “storage policy” is generally a data structure or other information source that includes a set of preferences and other storage criteria associated with performing a storage operation. The preferences and storage criteria may include, but are not limited to, a storage location, relationships between system components, network pathways to utilize in a storage operation, data characteristics, compression or encryption requirements, preferred system components to utilize in a storage operation, a deduplication, single instancing or variable instancing policy to apply to the data, and/or other criteria relating to a storage operation. For example, a storage policy may indicate that certain data is to be stored in the storage device 115, retained for a specified period of time before being aged to another tier of secondary storage, copied to the storage device 115 using a specified number of data streams, etc.
A “schedule policy” may specify a frequency with which to perform storage operations and a window of time within which to perform them. For example, a schedule policy may specify that a storage operation is to be performed every Saturday morning from 2:00 a.m. to 4:00 a.m. In some cases, the storage policy includes information generally specified by the schedule policy. (Put another way, the storage policy includes the schedule policy.) A “retention policy” may specify how long data is to be retained at specific tiers of storage or what criteria must be met before data may be pruned or moved from one tier of storage to another tier of storage. Storage policies, schedule policies and/or retention policies may be stored in a database of the storage manager 105, on archive media as metadata for use in restore operations or other storage operations, or in other locations or components of the system 500.
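A schedule policy of the kind described above can be modeled as a small record with a window check. The sketch below is an assumption for illustration: the `SchedulePolicy` class and its fields are hypothetical, the window is assumed to fall within a single day, and the end of the window is treated as exclusive.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class SchedulePolicy:
    """Hypothetical weekly window during which a storage operation may run."""
    weekday: int  # datetime.weekday() convention: Monday is 0, Saturday is 5
    start: time
    end: time

    def allows(self, moment: datetime) -> bool:
        """Return True if `moment` falls inside the weekly window."""
        return (moment.weekday() == self.weekday
                and self.start <= moment.time() < self.end)


# Every Saturday morning from 2:00 a.m. to 4:00 a.m.
saturday_window = SchedulePolicy(weekday=5, start=time(2, 0), end=time(4, 0))
```

For example, `saturday_window.allows(datetime(2010, 10, 2, 3, 0))` evaluates to `True`, since Oct. 2, 2010 was a Saturday and 3:00 a.m. lies inside the window.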
The system 500 may comprise a storage operation cell that is one of multiple storage operation cells arranged in a hierarchy or other organization. Storage operation cells may be related to backup cells and provide some or all of the functionality of backup cells as described in the assignee's U.S. patent application Ser. No. 09/354,058, now U.S. Pat. No. 7,395,282, which is incorporated herein by reference in its entirety. However, storage operation cells may also perform additional types of storage operations and other types of storage management functions that are not generally offered by backup cells.
Storage operation cells may contain not only physical devices, but also may represent logical concepts, organizations, and hierarchies. For example, a first storage operation cell may be configured to perform a first type of storage operations such as HSM operations, which may include backup or other types of data migration, and may include a variety of physical components including a storage manager 105 (or management agent 131), a secondary storage computing device 165, a client 130, and other components as described herein. A second storage operation cell may contain the same or similar physical components; however, it may be configured to perform a second type of storage operations, such as storage resource management (SRM) operations, and may include monitoring a primary data copy or performing other known SRM operations.
Thus, as can be seen from the above, although the first and second storage operation cells are logically distinct entities configured to perform different management functions (i.e., HSM and SRM, respectively), each storage operation cell may contain the same or similar physical devices. Alternatively, different storage operation cells may contain some of the same physical devices and not others. For example, a storage operation cell configured to perform SRM tasks may contain a secondary storage computing device 165, client 130, or other network device connected to a primary storage volume, while a storage operation cell configured to perform HSM tasks may instead include a secondary storage computing device 165, client 130, or other network device connected to a secondary storage volume and not contain the elements or components associated with and including the primary storage volume. (The term “connected” as used herein does not necessarily require a physical connection; rather, it could refer to two devices that are operably coupled to each other, communicably coupled to each other, in communication with each other, or more generally, refer to the capability of two devices to communicate with each other.) These two storage operation cells, however, may each include a different storage manager 105 that coordinates storage operations via the same secondary storage computing devices 165 and storage devices 115. This “overlapping” configuration allows storage resources to be accessed by more than one storage manager 105, such that multiple paths exist to each storage device 115 facilitating failover, load balancing, and promoting robust data access via alternative routes.
Alternatively or additionally, the same storage manager 105 may control two or more storage operation cells (whether or not each storage operation cell has its own dedicated storage manager 105). Moreover, in certain embodiments, the extent or type of overlap may be user-defined (through a control console) or may be automatically configured to optimize data storage and/or retrieval.
Data agent 195 may be a software module or part of a software module that is generally responsible for performing storage operations on the data of the client 130 stored in data store 160/162 or other memory location. Each client 130 may have at least one data agent 195 and the system 500 can support multiple clients 130. Data agent 195 may be distributed between client 130 and storage manager 105 (and any other intermediate components), or it may be deployed from a remote location or its functions approximated by a remote process that performs some or all of the functions of data agent 195.
The overall system 500 may employ multiple data agents 195, each of which may perform storage operations on data associated with a different application. For example, different individual data agents 195 may be designed to handle Microsoft Exchange data, UNIX data, Lotus Notes data, Microsoft Windows file system data, Microsoft Active Directory Objects data, and other types of data known in the art. Other embodiments may employ one or more generic data agents 195 that can handle and process multiple data types rather than using the specialized data agents described above.
If a client 130 has two or more types of data, one data agent 195 may be required for each data type to perform storage operations on the data of the client 130. For example, to back up, migrate, and restore all the data on a Microsoft Exchange server, the client 130 may use one Microsoft Exchange Mailbox data agent 195 to back up the Exchange mailboxes, one Microsoft Exchange 2000 Database data agent 195 to back up the Exchange databases, one Microsoft Exchange 2000 Public Folder data agent 195 to back up the Exchange 2000 Public Folders, and one Microsoft Windows File System data agent 195 to back up the file system of the client 130. These data agents 195 would be treated as four separate data agents 195 by the system even though they reside on the same client 130.
Alternatively, the overall system 500 may use one or more generic data agents 195, each of which may be capable of handling two or more data types. For example, one generic data agent 195 may be used to back up, migrate and restore Microsoft Exchange 2000 Mailbox data and Microsoft Exchange Database data while another generic data agent 195 may handle Microsoft Exchange Public Folder data and Microsoft Windows File System data, etc.
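The difference between specialized and generic data agents can be illustrated by counting the agents needed to cover the data types of a client. This is a sketch under stated assumptions: the `agents_for` function, the agent names, and the data type labels are hypothetical and only mirror the examples given above.

```python
from typing import Dict, FrozenSet, Iterable, Set


def agents_for(data_types: Iterable[str],
               agents: Dict[str, FrozenSet[str]]) -> Set[str]:
    """Return the names of the agents needed to cover the given data types."""
    needed: Set[str] = set()
    for data_type in data_types:
        for name, handled in agents.items():
            if data_type in handled:
                needed.add(name)
                break
        else:
            # No configured agent handles this data type.
            raise LookupError(f"no data agent handles {data_type!r}")
    return needed


CLIENT_TYPES = ["mailbox", "database", "public_folder", "file_system"]

# One specialized agent per data type.
SPECIALIZED = {
    "mailbox_agent": frozenset({"mailbox"}),
    "database_agent": frozenset({"database"}),
    "public_folder_agent": frozenset({"public_folder"}),
    "file_system_agent": frozenset({"file_system"}),
}

# Two generic agents, each handling two data types.
GENERIC = {
    "generic_agent_a": frozenset({"mailbox", "database"}),
    "generic_agent_b": frozenset({"public_folder", "file_system"}),
}
```

With these illustrative tables, the same client needs four specialized agents but only two generic agents.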
Data agents 195 may be responsible for arranging or packing data to be copied or migrated into a certain format such as an archive file. Nonetheless, it will be understood that this represents only one example, and any suitable packing or containerization technique or transfer methodology may be used if desired. Such an archive file may include metadata, a list of files or data objects copied, and the files and data objects themselves. Moreover, any data moved by the data agents may be tracked within the system by updating indexes associated with appropriate storage managers 105 or secondary storage computing devices 165. As used herein, a file or a data object refers to any collection or grouping of bytes of data that can be viewed as one or more logical units.
Generally speaking, storage manager 105 may be a software module or other application that coordinates and controls storage operations performed by the system 500. Storage manager 105 may communicate with some or all elements of the system 500, including clients 130, data agents 195, secondary storage computing devices 165, and storage devices 115, to initiate and manage storage operations (e.g., backups, migrations, data recovery operations, etc.).
Storage manager 105 may include a jobs agent 120 that monitors the status of some or all storage operations previously performed, currently being performed, or scheduled to be performed by the system 500. (One or more storage operations are alternatively referred to herein as a “job” or “jobs.”) Jobs agent 120 may be communicatively coupled to an interface agent 125 (e.g., a software module or application). Interface agent 125 may include information processing and display software, such as a graphical user interface (“GUI”), an application programming interface (“API”), or other interactive interface through which users and system processes can retrieve information about the status of storage operations. For example, in an arrangement of multiple storage operation cells, through interface agent 125, users may optionally issue instructions to various storage operation cells regarding performance of the storage operations as described and contemplated herein. For example, a user may modify a schedule concerning the number of pending snapshot copies or other types of copies scheduled as needed to suit particular needs or requirements. As another example, a user may employ the GUI to view the status of pending storage operations in some or all of the storage operation cells in a given network or to monitor the status of certain components in a particular storage operation cell (e.g., the amount of storage capacity left in a particular storage device 115).
Storage manager 105 may also include a management agent 131 that is typically implemented as a software module or application program. In general, management agent 131 provides an interface that allows various management agents 131 in other storage operation cells to communicate with one another. For example, assume a certain network configuration includes multiple storage operation cells hierarchically arranged or otherwise logically related in a WAN or LAN configuration. With this arrangement, each storage operation cell may be connected to the other through each respective interface agent 125. This allows each storage operation cell to send and receive certain pertinent information from other storage operation cells, including status information, routing information, information regarding capacity and utilization, etc. These communications paths may also be used to convey information and instructions regarding storage operations.
For example, a management agent 131 in a first storage operation cell may communicate with a management agent 131 in a second storage operation cell regarding the status of storage operations in the second storage operation cell. Another illustrative example includes the case where a management agent 131 in a first storage operation cell communicates with a management agent 131 in a second storage operation cell to control storage manager 105 (and other components) of the second storage operation cell via management agent 131 contained in storage manager 105.
Another illustrative example is the case where management agent 131 in a first storage operation cell communicates directly with and controls the components in a second storage operation cell and bypasses the storage manager 105 in the second storage operation cell. If desired, storage operation cells can also be organized hierarchically such that hierarchically superior cells control or pass information to hierarchically subordinate cells or vice versa.
Storage manager 105 may also maintain an index, a database, or other data structure 111. The data stored in database 111 may be used to indicate logical associations between components of the system, user preferences, management tasks, media containerization and data storage information or other useful data. For example, the storage manager 105 may use data from database 111 to track logical associations between secondary storage computing device 165 and storage devices 115 (or movement of data as containerized from primary to secondary storage).
Generally speaking, the secondary storage computing device 165, which may also be referred to as a media agent, may be implemented as a software module that conveys data, as directed by storage manager 105, between a client 130 and one or more storage devices 115 such as a tape library, a magnetic media storage device, an optical media storage device, or any other suitable storage device. In one embodiment, secondary storage computing device 165 may be communicatively coupled to and control a storage device 115. A secondary storage computing device 165 may be considered to be associated with a particular storage device 115 if that secondary storage computing device 165 is capable of routing and storing data to that particular storage device 115.
In operation, a secondary storage computing device 165 associated with a particular storage device 115 may instruct the storage device to use a robotic arm or other retrieval means to load or eject a certain storage media, and to subsequently archive, migrate, or restore data to or from that media. Secondary storage computing device 165 may communicate with a storage device 115 via a suitable communications path such as a SCSI or Fibre Channel communications link. In some embodiments, the storage device 115 may be communicatively coupled to the storage manager 105 via a SAN.
Each secondary storage computing device 165 may maintain an index, a database, or other data structure 161 that may store index data generated during storage operations for secondary storage (SS) as described herein, including creating a metabase (MB). For example, performing storage operations on Microsoft Exchange data may generate index data. Such index data provides a secondary storage computing device 165 or other external device with a fast and efficient mechanism for locating data stored or backed up. Thus, a secondary storage computing device index 161, or a database 111 of a storage manager 105, may store data associating a client 130 with a particular secondary storage computing device 165 or storage device 115, for example, as specified in a storage policy, while a database or other data structure in secondary storage computing device 165 may indicate where specifically the data of the client 130 is stored in storage device 115, what specific files were stored, and other information associated with storage of the data of the client 130. In some embodiments, such index data may be stored along with the data backed up in a storage device 115, with an additional copy of the index data written to index cache in a secondary storage device. Thus the data is readily available for use in storage operations and other activities without having to be first retrieved from the storage device 115.
Generally speaking, information stored in cache is typically recent information that reflects certain particulars about operations that have recently occurred. After a certain period of time, this information is sent to secondary storage and tracked. This information may need to be retrieved and uploaded back into a cache or other memory in a secondary computing device before data can be retrieved from storage device 115. In some embodiments, the cached information may include information regarding format or containerization of archives or other files stored on storage device 115.
One or more of the secondary storage computing devices 165 may also maintain one or more single instance databases 123. Single instancing (alternatively called data deduplication) generally refers to storing in secondary storage only a single instance of each data object (or data block) in a set of data (e.g., primary data). More details as to single instancing may be found in one or more of the following commonly-assigned U.S. patent applications: 1) U.S. patent application Ser. No. 11/269,512 (entitled SYSTEM AND METHOD TO SUPPORT SINGLE INSTANCE STORAGE OPERATIONS); 2) U.S. patent application Ser. No. 12/145,347 (entitled APPLICATION-AWARE AND REMOTE SINGLE INSTANCE DATA MANAGEMENT); 3) U.S. patent application Ser. No. 12/145,342 (entitled APPLICATION-AWARE AND REMOTE SINGLE INSTANCE DATA MANAGEMENT); 4) U.S. patent application Ser. No. 11/963,623 (entitled SYSTEM AND METHOD FOR STORING REDUNDANT INFORMATION); 5) U.S. patent application Ser. No. 11/950,376 (entitled SYSTEMS AND METHODS FOR CREATING COPIES OF DATA SUCH AS ARCHIVE COPIES); or 6) U.S. Pat. App. No. 61/100,686 (entitled SYSTEMS AND METHODS FOR MANAGING SINGLE INSTANCING DATA), each of which is incorporated by reference herein in its entirety.
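The idea of storing only a single instance of each data object can be sketched with a content-addressed store. This is an illustration under stated assumptions and not the single instance database 123: identifying identical content by a SHA-256 digest is a common technique that the text above does not specify, and the `SingleInstanceStore` class and its methods are hypothetical.

```python
import hashlib
from typing import Dict


class SingleInstanceStore:
    """Toy store that keeps one instance per distinct content (assumed design)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}  # digest -> the single stored instance
        self._refs: Dict[str, int] = {}     # digest -> number of references

    def put(self, data: bytes) -> str:
        """Store `data` if it is new and return its digest as a reference."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._blobs:
            self._blobs[digest] = data  # first instance: actually stored
        self._refs[digest] = self._refs.get(digest, 0) + 1
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def instance_count(self) -> int:
        return len(self._blobs)

    def reference_count(self, digest: str) -> int:
        return self._refs.get(digest, 0)
```

Storing the same content twice yields the same reference and leaves the number of stored instances unchanged, while the reference count records how many data objects share that instance.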
In some examples, the secondary storage computing devices 165 maintain one or more variable instance databases. Variable instancing generally refers to storing in secondary storage one or more instances, but fewer than the total number of instances, of each data block (or data object) in a set of data (e.g., primary data). More details as to variable instancing may be found in the commonly-assigned U.S. Pat. App. No. 61/164,803 (entitled STORING A VARIABLE NUMBER OF INSTANCES OF DATA OBJECTS).
In some embodiments, certain components may reside and execute on the same computer. For example, in some embodiments, a client 130 such as a data agent 195, or a storage manager 105, coordinates and directs local archiving, migration, and retrieval application functions as further described in the previously-referenced U.S. patent application Ser. No. 09/610,738. This client 130 can function independently or together with other similar clients 130.
As shown in FIG. 5, each secondary storage computing device 165 has its own associated metabase 161. Each client 130 may also have its own associated metabase 170. However, in some embodiments, each “tier” of storage, such as primary storage, secondary storage, tertiary storage, etc., may have multiple metabases or a centralized metabase, as described herein. For example, rather than a separate metabase or index associated with each client 130 in FIG. 5, the metabases on this storage tier may be centralized. Similarly, second and other tiers of storage may have either centralized or distributed metabases. Moreover, mixed architecture systems may be used if desired, that may include a first tier centralized metabase system coupled to a second tier storage system having distributed metabases and vice versa, etc.
Moreover, in operation, a storage manager 105 or other management module may keep track of certain information that allows the storage manager 105 to select, designate, or otherwise identify metabases to be searched in response to certain queries as further described herein. Movement of data between primary and secondary storage may also involve movement of associated metadata and other tracking information as further described herein.
In some examples, primary data may be organized into one or more sub-clients. A sub-client is a portion of the data of one or more clients 130, and can contain either all of the data of the clients 130 or a designated subset thereof. As depicted in FIG. 5, the data store 162 includes two sub-clients. For example, an administrator (or other user with the appropriate permissions; the term administrator is used herein for brevity) may find it preferable to separate email data from financial data using two different sub-clients having different storage preferences, retention criteria, etc.
CONCLUSION
Systems and modules described herein may comprise software, firmware, hardware, or any combination(s) of software, firmware, or hardware suitable for the purposes described herein. Software and other modules may reside on servers, workstations, personal computers, computerized tablets, PDAs, smart phones, and other devices suitable for the purposes described herein. Modules described herein may be executed by a general-purpose computer, e.g., a server computer, wireless device, or personal computer. Those skilled in the relevant art will appreciate that aspects of the invention can be practiced with other communications, data processing, or computer system configurations, including: Internet appliances, hand-held devices (including personal digital assistants (PDAs)), wearable computers, all manner of cellular or mobile phones, multi-processor systems, microprocessor-based or programmable consumer electronics, set-top boxes, network PCs, mini-computers, mainframe computers, and the like. Indeed, the terms “computer,” “server,” “host,” “host system,” and the like, are generally used interchangeably herein and refer to any of the above devices and systems, as well as any data processor. Furthermore, aspects of the invention can be embodied in a special purpose computer or data processor that is specifically programmed, configured, or constructed to perform one or more of the computer-executable instructions explained in detail herein.
Software and other modules may be accessible via local memory, a network, a browser, or other application in an ASP context, or via another means suitable for the purposes described herein. Examples of the technology can also be practiced in distributed computing environments where tasks or modules are performed by remote processing devices, which are linked through a communications network, such as a Local Area Network (LAN), Wide Area Network (WAN), or the Internet. In a distributed computing environment, program modules may be located in both local and remote memory storage devices. Data structures described herein may comprise computer files, variables, programming arrays, programming structures, or any electronic information storage schemes or methods, or any combinations thereof, suitable for the purposes described herein. User interface elements described herein may comprise elements from graphical user interfaces, command line interfaces, and other interfaces suitable for the purposes described herein.
Examples of the technology may be stored or distributed on computer-readable media, including magnetically or optically readable computer disks, hard-wired or preprogrammed chips (e.g., EEPROM semiconductor chips), nanotechnology memory, biological memory, or other data storage media. Indeed, computer-implemented instructions, data structures, screen displays, and other data under aspects of the invention may be distributed over the Internet or over other networks (including wireless networks), on a propagated signal on a propagation medium (e.g., an electromagnetic wave(s), a sound wave, etc.) over a period of time, or they may be provided on any analog or digital network (packet switched, circuit switched, or other scheme).
Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense, as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to.” As used herein, the terms “connected,” “coupled,” or any variant thereof, mean any connection or coupling, either direct or indirect, between two or more elements; the coupling or connection between the elements can be physical, logical, or a combination thereof. Additionally, the words “herein,” “above,” “below,” and words of similar import, when used in this application, refer to this application as a whole and not to any particular portions of this application. Where the context permits, words in the above Detailed Description using the singular or plural number may also include the plural or singular number respectively. The word “or,” in reference to a list of two or more items, covers all of the following interpretations of the word: any of the items in the list, all of the items in the list, and any combination of the items in the list.
The above Detailed Description is not intended to be exhaustive or to limit the invention to the precise form disclosed above. While specific examples for the invention are described above for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize. For example, while processes or blocks are presented in a given order, alternative implementations may perform routines having steps, or employ systems having blocks, in a different order, and some processes or blocks may be deleted, moved, added, subdivided, combined, and/or modified to provide alternatives or subcombinations. Each of these processes or blocks may be implemented in a variety of different ways. Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed or implemented in parallel, or may be performed at different times. Further, any specific numbers noted herein are only examples: alternative implementations may employ differing values or ranges.
The teachings of the invention provided herein can be applied to other systems, not necessarily the systems described herein. The elements and acts of the various examples described above can be combined to provide further implementations of the invention.
Any patents and applications and other references noted above, including any that may be listed in accompanying filing papers, are incorporated herein by reference. Aspects of the invention can be modified, if necessary, to employ the systems, functions, and concepts of the various references described above to provide yet further implementations of the invention.
These and other changes can be made to the invention in light of the above Detailed Description. While the above description describes certain examples of the invention and describes the best mode contemplated, no matter how detailed the above appears in text, the invention can be practiced in many ways. Details of the system may vary considerably in its specific implementation, while still being encompassed by the invention disclosed herein. As noted above, particular terminology used when describing certain features or aspects of the invention should not be taken to imply that the terminology is being redefined herein to be restricted to any specific characteristics, features, or aspects of the invention with which that terminology is associated. In general, the terms used in the following claims should not be construed to limit the invention to the specific examples disclosed in the specification, unless the above Detailed Description section explicitly defines such terms. Accordingly, the actual scope of the invention encompasses not only the disclosed examples, but also all equivalent ways of practicing or implementing the invention under the claims.
While certain examples are presented below in certain forms, the applicant contemplates the various aspects of the invention in any number of claim forms. Accordingly, the applicant reserves the right to add additional claims after filing the application to pursue such additional claim forms for other aspects of the invention.

Claims (22)

We claim:
1. A non-transitory computer-readable storage medium whose contents cause a data storage system to perform a method for archiving multiple primary data objects, the method comprising:
receiving, from a source computing device, both full and incremental backup copies of primary data associated with the source computing device;
creating a secondary copy of multiple data objects comprising the primary data by using the received full and incremental backup copies of the primary data;
for each of the multiple data objects for which a secondary copy was created, adding an entry for the corresponding data object to a first data structure,
wherein the entry includes an identifier associated with the corresponding data object;
after creating the secondary copy, identifying one or more of the multiple data objects that satisfy one or more predetermined archival criteria,
wherein the one or more predetermined archival criteria are specified by a storage policy assigned to the primary data associated with the source computing device; and
for each identified data object of the identified one or more of the multiple data objects:
querying the first data structure using the identifier associated with the identified data object, to verify that a secondary copy of the identified data object exists in secondary storage;
replacing the identified data object in the primary data with a stub referencing the identified data object within the secondary copy of the multiple data objects,
wherein the secondary copy was created in association with a prior backup job; and
updating a stubs data structure with the identifier associated with the identified data object.
2. The computer-readable storage medium of claim 1, the method further comprising:
receiving a token for the identified data object, wherein the token represents a verification that the secondary copy was created.
3. The computer-readable storage medium of claim 2, wherein the token is included in the stub.
4. The computer-readable storage medium of claim 1, wherein the one or more predetermined archival criteria comprises at least one of: a data object type, a data object age, a data object size, a percentage of disk quota, remaining storage, and metadata.
5. The computer-readable storage medium of claim 1, the method further comprising:
receiving information regarding a first data object included in the primary data from a driver or file system that detects deletions;
using the received information to determine that the first data object has been deleted from the primary data and a corresponding deletion time; and
in response to determining that the corresponding deletion time is more than a predetermined period of time ago, deleting the secondary copy of the first data object.
6. The computer-readable storage medium of claim 5, wherein the predetermined period of time is determined at least in part by an object type of the first data object.
7. The computer-readable storage medium of claim 5, wherein the predetermined period of time is determined by the storage policy assigned to the source computing device.
8. The computer-readable storage medium of claim 1, the method further comprising:
after replacing the identified data object in the primary data with the stub referencing the identified data object, performing at least one of the following operations on the created secondary copy of the multiple data objects comprising the primary data: deduplication, decompression, compression, content-indexing, encryption, decryption, or data classification.
9. The computer-readable storage medium of claim 1,
wherein the stubs data structure further comprises information indicating where a secondary copy of the identified data object is stored.
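The method recited in claims 1 and 9 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class names, the in-memory dictionaries standing in for the first data structure and the stubs data structure, and the age and size thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    object_id: str      # identifier recorded in the first data structure
    size_bytes: int
    age_days: int


@dataclass
class Stub:
    object_id: str
    secondary_location: str  # where the secondary copy is stored (claim 9)


@dataclass
class ArchiveState:
    primary: dict = field(default_factory=dict)     # object_id -> DataObject or Stub
    copy_index: dict = field(default_factory=dict)  # first data structure: object_id -> location
    stubs: dict = field(default_factory=dict)       # stubs data structure: object_id -> location


def record_backup(state, objects, location_prefix):
    """Add an entry to the first data structure for each backed-up object."""
    for obj in objects:
        state.copy_index[obj.object_id] = f"{location_prefix}/{obj.object_id}"


def meets_criteria(obj, min_age_days, min_size_bytes):
    """Hypothetical storage-policy check based on object age and size."""
    return obj.age_days >= min_age_days and obj.size_bytes >= min_size_bytes


def archive(state, min_age_days=90, min_size_bytes=1024):
    """Stub qualifying primary objects, but only if a secondary copy is indexed."""
    archived = []
    for object_id, obj in list(state.primary.items()):
        if isinstance(obj, Stub) or not meets_criteria(obj, min_age_days, min_size_bytes):
            continue
        location = state.copy_index.get(object_id)
        if location is None:
            continue  # no verified secondary copy, so the object stays in primary data
        state.primary[object_id] = Stub(object_id, location)
        state.stubs[object_id] = location
        archived.append(object_id)
    return archived
```

In this sketch an object that satisfies the archival criteria but has no entry in the first data structure is left untouched, which mirrors the verification step that precedes stub replacement in the claim.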
10. A system for archiving data objects using secondary copies, the system comprising:
one or more computing devices configured to:
receive, from a source computing device, both full and incremental backup copies of primary data associated with the source computing device;
create a secondary copy of multiple data objects comprising the primary data by using the received full and incremental backup copies of the primary data;
for each of the multiple data objects for which a secondary copy was created, add an entry for the data object to a first data structure,
wherein the entry includes an identifier associated with the data object;
after the secondary copy is created, identify one or more of the multiple data objects that satisfy one or more predetermined archival criteria, wherein the one or more predetermined archival criteria is specified by a storage policy assigned to the primary data associated with the source computing device; and
for each identified data object of the identified one or more of the multiple data objects:
query the first data structure using the identifier associated with the identified data object, to verify that a secondary copy of the identified data object exists in secondary storage;
replace the identified data object in the primary data with a stub referencing the identified data object within the secondary copy of the multiple data objects, wherein the secondary copy was created in association with a prior backup job; and
update a stubs data structure with the identifier associated with the identified data object.
11. The system of claim 10, wherein the one or more computing devices is further configured to:
receive a token for the identified data object, wherein the token represents a verification that the secondary copy was created.
12. The system of claim 11, wherein the token is included in the stub.
13. The system of claim 10, wherein the predetermined archival criteria comprises at least one of: a data object type, a data object age, a data object size, a percentage of disk quota, remaining storage, and metadata.
14. The system of claim 10, wherein the one or more computing devices is further configured to:
receive information regarding a first data object included in the primary data from a driver or file system that detects deletions;
using the received information, determine that the first data object has been deleted from the primary data and a corresponding deletion time; and
in response to determining that the corresponding deletion time is more than a predetermined period of time ago, delete the secondary copy of the first data object.
15. The system of claim 14, wherein the predetermined period of time is determined at least in part by an object type of the first data object.
16. The system of claim 14, wherein the predetermined period of time is determined by the storage policy assigned to the source computing device.
17. The system of claim 10, wherein the one or more computing devices is further configured to:
perform at least one of the following operations on the created secondary copy of the multiple data objects comprising the primary data: deduplication, decompression, compression, content-indexing, encryption, decryption, or data classification.
18. The system of claim 10,
wherein the stubs data structure further comprises information indicating where the secondary copy of the identified data object is stored.
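The deletion handling recited in claims 5 to 7 and 14 to 16 can be sketched as follows. This is only an illustration under stated assumptions: the retention periods, the type names, and the dictionaries standing in for the secondary copies and for the deletion reports from a driver or file system are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical retention periods keyed by object type; the values are illustrative only.
RETENTION_BY_TYPE = {"email": timedelta(days=30), "document": timedelta(days=90)}
DEFAULT_RETENTION = timedelta(days=60)


def prune_secondary_copies(secondary_copies, deletions, now):
    """Delete secondary copies whose primary object was deleted long enough ago.

    secondary_copies: dict mapping object_id -> object type (stands in for stored copies)
    deletions: dict mapping object_id -> datetime at which the primary object was deleted
    Returns the sorted list of object ids whose secondary copies were removed.
    """
    pruned = []
    for object_id, deleted_at in deletions.items():
        if object_id not in secondary_copies:
            continue  # nothing to prune for this object
        retention = RETENTION_BY_TYPE.get(secondary_copies[object_id], DEFAULT_RETENTION)
        if now - deleted_at > retention:
            del secondary_copies[object_id]
            pruned.append(object_id)
    return sorted(pruned)
```

Keying the retention period on object type follows claims 6 and 15; a variant that reads a single period from the storage policy (claims 7 and 16) would replace the lookup with one configured value.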
19. A system for archiving data objects using secondary copies, the system comprising:
at least one processor;
at least one memory coupled to the at least one processor;
a first software component configured to create one or more secondary copies of primary data that contains multiple data objects;
a first data structure stored on the at least one memory that contains a mapping between the multiple data objects for which secondary copies have been created and locations of the secondary copies;
a second data structure stored on the at least one memory that stores, for each data object for which a secondary copy had been created, a unique token; and
a second software component configured to:
identify data objects to be archived,
generate corresponding tokens for identified data objects to be archived,
verify that previously-created secondary copies of the identified data objects exist by confirming that corresponding tokens are present in the second data structure, and
replace the identified data objects with stubs.
20. The system of claim 19, wherein after a predetermined period of time set by a storage policy, the data objects to be archived are replaced with stubs, and wherein the predetermined period of time is determined at least in part by an object type of the data objects.
21. The system of claim 19, wherein the previously-created secondary copies of the identified data objects are associated with a backup copy created at least in part by an incremental or full backup operation.
22. The system of claim 19, wherein the second software component is further configured to:
determine that a first data object included in the primary data satisfies predetermined criteria; and
in response to determining that the first data object satisfies the predetermined criteria, delete the secondary copy of the first data object.
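The token-based verification of claim 19 can be illustrated with a short Python sketch. The claims do not say how a token is computed; the SHA-256 scheme below, the function names, and the use of a set as the second data structure are assumptions made purely for illustration.

```python
import hashlib


def make_token(object_id: str, content: bytes) -> str:
    """Assumed token scheme: SHA-256 over the object identifier and its content."""
    digest = hashlib.sha256()
    digest.update(object_id.encode("utf-8"))
    digest.update(b"\x00")  # separator between identifier and content
    digest.update(content)
    return digest.hexdigest()


def archive_with_tokens(primary, token_store, candidates):
    """Stub each candidate whose regenerated token is present in token_store.

    primary: dict mapping object_id -> bytes (live content) or dict (an existing stub)
    token_store: set of tokens recorded when secondary copies were created
    candidates: iterable of object ids identified for archiving
    Returns the list of object ids that were replaced with stubs.
    """
    stubbed = []
    for object_id in candidates:
        content = primary.get(object_id)
        if not isinstance(content, bytes):
            continue  # missing or already stubbed
        token = make_token(object_id, content)
        if token in token_store:
            # The token travels with the stub, as in claims 3 and 12.
            primary[object_id] = {"stub": True, "token": token}
            stubbed.append(object_id)
    return stubbed
```

Under this assumed scheme, an object modified after its last backup yields a token that is absent from the second data structure, so it is not stubbed until a newer secondary copy exists.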
US16/934,432 2010-09-30 2020-07-21 Archiving data objects using secondary copies Active US11392538B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/934,432 US11392538B2 (en) 2010-09-30 2020-07-21 Archiving data objects using secondary copies
US17/841,575 US11768800B2 (en) 2010-09-30 2022-06-15 Archiving data objects using secondary copies

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US38856610P 2010-09-30 2010-09-30
US13/250,824 US8935492B2 (en) 2010-09-30 2011-09-30 Archiving data objects using secondary copies
US14/595,984 US9262275B2 (en) 2010-09-30 2015-01-13 Archiving data objects using secondary copies
US15/013,138 US9639563B2 (en) 2010-09-30 2016-02-02 Archiving data objects using secondary copies
US15/476,613 US10762036B2 (en) 2010-09-30 2017-03-31 Archiving data objects using secondary copies
US16/934,432 US11392538B2 (en) 2010-09-30 2020-07-21 Archiving data objects using secondary copies

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US15/476,613 Continuation US10762036B2 (en) 2010-09-30 2017-03-31 Archiving data objects using secondary copies

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/841,575 Continuation US11768800B2 (en) 2010-09-30 2022-06-15 Archiving data objects using secondary copies

Publications (2)

Publication Number Publication Date
US20200349107A1 US20200349107A1 (en) 2020-11-05
US11392538B2 true US11392538B2 (en) 2022-07-19

Family

ID=45890828

Family Applications (6)

Application Number Title Priority Date Filing Date
US13/250,824 Active 2033-05-21 US8935492B2 (en) 2010-09-30 2011-09-30 Archiving data objects using secondary copies
US14/595,984 Active US9262275B2 (en) 2010-09-30 2015-01-13 Archiving data objects using secondary copies
US15/013,138 Active US9639563B2 (en) 2010-09-30 2016-02-02 Archiving data objects using secondary copies
US15/476,613 Active US10762036B2 (en) 2010-09-30 2017-03-31 Archiving data objects using secondary copies
US16/934,432 Active US11392538B2 (en) 2010-09-30 2020-07-21 Archiving data objects using secondary copies
US17/841,575 Active US11768800B2 (en) 2010-09-30 2022-06-15 Archiving data objects using secondary copies

Country Status (2)

Country Link
US (6) US8935492B2 (en)
WO (1) WO2012045023A2 (en)

Families Citing this family (91)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7640746B2 (en) * 2005-05-27 2010-01-05 Markon Technologies, LLC Method and system integrating solar heat into a regenerative rankine steam cycle
WO2008070688A1 (en) 2006-12-04 2008-06-12 Commvault Systems, Inc. Systems and methods for creating copies of data, such as archive copies
US7840537B2 (en) 2006-12-22 2010-11-23 Commvault Systems, Inc. System and method for storing redundant information
US8769048B2 (en) 2008-06-18 2014-07-01 Commvault Systems, Inc. Data protection scheduling, such as providing a flexible backup window in a data protection system
US9098495B2 (en) * 2008-06-24 2015-08-04 Commvault Systems, Inc. Application-aware and remote single instance data management
US8166263B2 (en) 2008-07-03 2012-04-24 Commvault Systems, Inc. Continuous data protection over intermittent connections, such as continuous data backup for laptops or wireless devices
US8725688B2 (en) 2008-09-05 2014-05-13 Commvault Systems, Inc. Image level copy or restore, such as image level restore without knowledge of data object metadata
US9015181B2 (en) * 2008-09-26 2015-04-21 Commvault Systems, Inc. Systems and methods for managing single instancing data
EP2329378A4 (en) 2008-09-26 2015-11-25 Commvault Systems Inc Systems and methods for managing single instancing data
US8412677B2 (en) 2008-11-26 2013-04-02 Commvault Systems, Inc. Systems and methods for byte-level or quasi byte-level single instancing
US8401996B2 (en) 2009-03-30 2013-03-19 Commvault Systems, Inc. Storing a variable number of instances of data objects
US8578120B2 (en) 2009-05-22 2013-11-05 Commvault Systems, Inc. Block-level single instancing
US8935492B2 (en) 2010-09-30 2015-01-13 Commvault Systems, Inc. Archiving data objects using secondary copies
US9749132B1 (en) * 2011-11-28 2017-08-29 Amazon Technologies, Inc. System and method for secure deletion of data
US9020890B2 (en) 2012-03-30 2015-04-28 Commvault Systems, Inc. Smart archiving and data previewing for mobile devices
JP5881859B2 (en) 2012-04-13 2016-03-09 株式会社日立製作所 Storage device
US20130297576A1 (en) * 2012-05-03 2013-11-07 Microsoft Corporation Efficient in-place preservation of content across content sources
US9275086B2 (en) 2012-07-20 2016-03-01 Commvault Systems, Inc. Systems and methods for database archiving
US9122647B2 (en) * 2012-10-16 2015-09-01 Dell Products, L.P. System and method to backup objects on an object storage platform
US10042907B2 (en) * 2012-11-29 2018-08-07 Teradata Us, Inc. Providing metadata to database systems and environments with multiple processing units or modules
US9633216B2 (en) 2012-12-27 2017-04-25 Commvault Systems, Inc. Application of information management policies based on operation with a geographic entity
US9633022B2 (en) 2012-12-28 2017-04-25 Commvault Systems, Inc. Backup and restoration for a deduplicated file system
US9846620B2 (en) 2013-01-11 2017-12-19 Commvault Systems, Inc. Table level database restore in a data storage system
US9459968B2 (en) 2013-03-11 2016-10-04 Commvault Systems, Inc. Single index to query multiple backup formats
US10642795B2 (en) * 2013-04-30 2020-05-05 Oracle International Corporation System and method for efficiently duplicating data in a storage system, eliminating the need to read the source data or write the target data
US9110847B2 (en) * 2013-06-24 2015-08-18 Sap Se N to M host system copy
US20150172120A1 (en) * 2013-12-12 2015-06-18 Commvault Systems, Inc. Managing non-conforming entities in information management systems, including enforcing conformance with a model entity
US10324897B2 (en) 2014-01-27 2019-06-18 Commvault Systems, Inc. Techniques for serving archived electronic mail
US9798596B2 (en) 2014-02-27 2017-10-24 Commvault Systems, Inc. Automatic alert escalation for an information management system
US9648100B2 (en) * 2014-03-05 2017-05-09 Commvault Systems, Inc. Cross-system storage management for transferring data across autonomous information management systems
US9467514B2 (en) 2014-03-12 2016-10-11 Cameron International Corporation Token-based data management system and method for a network
US10169396B2 (en) * 2014-03-27 2019-01-01 Salesforce.Com, Inc. Maintaining data consistency between transactional and non-transactional data stores
US9823978B2 (en) * 2014-04-16 2017-11-21 Commvault Systems, Inc. User-level quota management of data objects stored in information management systems
US9740574B2 (en) 2014-05-09 2017-08-22 Commvault Systems, Inc. Load balancing across multiple data paths
US9607004B2 (en) * 2014-06-18 2017-03-28 International Business Machines Corporation Storage device data migration
WO2016007158A1 (en) * 2014-07-10 2016-01-14 Hewlett-Packard Development Company, L.P. Archive file
US20160019224A1 (en) * 2014-07-18 2016-01-21 Commvault Systems, Inc. File system content archiving based on third-party application archiving rules and metadata
US11249858B2 (en) 2014-08-06 2022-02-15 Commvault Systems, Inc. Point-in-time backups of a production application made accessible over fibre channel and/or ISCSI as data sources to a remote application by representing the backups as pseudo-disks operating apart from the production application and its host
US9852026B2 (en) 2014-08-06 2017-12-26 Commvault Systems, Inc. Efficient application recovery in an information management system based on a pseudo-storage-device driver
US10565159B2 (en) 2014-08-12 2020-02-18 International Business Machines Corporation Archiving data sets in a volume in a primary storage in a volume image copy of the volume in a secondary storage
US20160210306A1 (en) 2015-01-15 2016-07-21 Commvault Systems, Inc. Managing structured data in a data storage system
US10108687B2 (en) 2015-01-21 2018-10-23 Commvault Systems, Inc. Database protection using block-level mapping
US9928144B2 (en) * 2015-03-30 2018-03-27 Commvault Systems, Inc. Storage management of data using an open-archive architecture, including streamlined access to primary data originally stored on network-attached storage and archived to secondary storage
US10445289B1 (en) * 2015-03-31 2019-10-15 EMC IP Holding Company LLC Method and apparatus for automatic cleanup of disfavored content
US9996429B1 (en) 2015-04-14 2018-06-12 EMC IP Holding Company LLC Mountable container backups for files
US10078555B1 (en) * 2015-04-14 2018-09-18 EMC IP Holding Company LLC Synthetic full backups for incremental file backups
US9946603B1 (en) 2015-04-14 2018-04-17 EMC IP Holding Company LLC Mountable container for incremental file backups
US9904598B2 (en) 2015-04-21 2018-02-27 Commvault Systems, Inc. Content-independent and database management system-independent synthetic full backup of a database based on snapshot technology
US10324914B2 (en) 2015-05-20 2019-06-18 Commvalut Systems, Inc. Handling user queries against production and archive storage systems, such as for enterprise customers having large and/or numerous files
US9875374B2 (en) * 2015-07-01 2018-01-23 Michael L. Brownewell System and method for collecting, storing, and securing data
US9766825B2 (en) 2015-07-22 2017-09-19 Commvault Systems, Inc. Browse and restore for block-level backups
US10311042B1 (en) * 2015-08-31 2019-06-04 Commvault Systems, Inc. Organically managing primary and secondary storage of a data object based on expiry timeframe supplied by a user of the data object
US9811271B2 (en) 2015-11-30 2017-11-07 International Business Machines Corporation Migration of data storage
US9967337B1 (en) * 2015-12-29 2018-05-08 EMC IP Holding Company LLC Corruption-resistant backup policy
US10296368B2 (en) 2016-03-09 2019-05-21 Commvault Systems, Inc. Hypervisor-independent block-level live browse for access to backed up virtual machine (VM) data and hypervisor-free file-level recovery (block-level pseudo-mount)
US10296593B2 (en) 2016-06-24 2019-05-21 International Business Machines Corporation Managing storage system metadata during data migration
US10838821B2 (en) 2017-02-08 2020-11-17 Commvault Systems, Inc. Migrating content and metadata from a backup system
US10740193B2 (en) 2017-02-27 2020-08-11 Commvault Systems, Inc. Hypervisor-independent reference copies of virtual machine payload data based on block-level pseudo-mount
US10891069B2 (en) 2017-03-27 2021-01-12 Commvault Systems, Inc. Creating local copies of data stored in online data repositories
US10776329B2 (en) 2017-03-28 2020-09-15 Commvault Systems, Inc. Migration of a database management system to cloud storage
US10459631B2 (en) 2017-03-28 2019-10-29 Nicira, Inc. Managing deletion of logical objects of a managed system
US11074140B2 (en) 2017-03-29 2021-07-27 Commvault Systems, Inc. Live browsing of granular mailbox data
US10496599B1 (en) 2017-04-30 2019-12-03 EMC IP Holding Company LLC Cloud data archiving using chunk-object mapping and synthetic full backup
US10664352B2 (en) 2017-06-14 2020-05-26 Commvault Systems, Inc. Live browsing of backed up data residing on cloned disks
US12130798B1 (en) * 2017-06-16 2024-10-29 Amazon Technologies, Inc. Variable reclamation of data copies
US10642809B2 (en) 2017-06-26 2020-05-05 International Business Machines Corporation Import, export, and copy management for tiered object storage
US11341103B2 (en) * 2017-08-04 2022-05-24 International Business Machines Corporation Replicating and migrating files to secondary storage sites
US10846180B2 (en) 2017-09-14 2020-11-24 Commvault Systems, Inc. Distributed framework for task splitting and task assignments in a content indexing system
US11086834B2 (en) 2017-09-14 2021-08-10 Commvault Systems, Inc. Distributed framework for data proximity-based task splitting in a content indexing system
US10846266B2 (en) 2017-09-14 2020-11-24 Commvault Systems, Inc. Distributed architecture for content indexing emails
US11036592B2 (en) 2017-09-14 2021-06-15 Commvault Systems, Inc. Distributed content indexing architecture with separately stored file previews
US11263088B2 (en) 2017-09-14 2022-03-01 Commvault Systems, Inc. Distributed architecture for tracking content indexing
US10742735B2 (en) 2017-12-12 2020-08-11 Commvault Systems, Inc. Enhanced network attached storage (NAS) services interfacing to cloud storage
US10795927B2 (en) 2018-02-05 2020-10-06 Commvault Systems, Inc. On-demand metadata extraction of clinical image data
US10754729B2 (en) 2018-03-12 2020-08-25 Commvault Systems, Inc. Recovery point objective (RPO) driven backup scheduling in a data storage management system
US10789387B2 (en) 2018-03-13 2020-09-29 Commvault Systems, Inc. Graphical representation of an information management system
US10536522B2 (en) * 2018-04-30 2020-01-14 EMC IP Holding Company LLC Data storage system with LUN archiving to cloud using volume-to-object translation
US10868782B2 (en) 2018-07-12 2020-12-15 Bank Of America Corporation System for flagging data transmissions for retention of metadata and triggering appropriate transmission placement
US10860443B2 (en) 2018-12-10 2020-12-08 Commvault Systems, Inc. Evaluation and reporting of recovery readiness in a data storage management system
US11194758B1 (en) * 2019-01-02 2021-12-07 Amazon Technologies, Inc. Data archiving using a compute efficient format in a service provider environment
US11269732B2 (en) 2019-03-12 2022-03-08 Commvault Systems, Inc. Managing structured data in a data storage system
US11308034B2 (en) 2019-06-27 2022-04-19 Commvault Systems, Inc. Continuously run log backup with minimal configuration and resource usage from the source machine
WO2021014324A1 (en) * 2019-07-19 2021-01-28 JFrog Ltd. Data archive release in context of data object
US10999397B2 (en) * 2019-07-23 2021-05-04 Microsoft Technology Licensing, Llc Clustered coherent cloud read cache without coherency messaging
US20210173811A1 (en) 2019-12-04 2021-06-10 Commvault Systems, Inc. Optimizing the restoration of deduplicated data stored in multi-node replicated file systems
US11860835B1 (en) * 2020-06-29 2024-01-02 Amazon Technologies, Inc. Efficient drop column requests in a non-relational data store
US11500566B2 (en) 2020-08-25 2022-11-15 Commvault Systems, Inc. Cloud-based distributed data storage system using block-level deduplication based on backup frequencies of incoming backup copies
US12088583B2 (en) * 2020-11-11 2024-09-10 Hewlett Packard Enterprise Development Lp Permissions for backup-related operations
US11966360B2 (en) * 2021-01-04 2024-04-23 Bank Of America Corporation System for optimized archival using data detection and classification model
US12105812B2 (en) 2022-04-19 2024-10-01 Bank Of America Corporation System and method for providing complex data encryption
US12095754B2 (en) 2022-04-20 2024-09-17 Bank Of America Corporation System and method for establishing a secure session to authenticate DNS requests via dynamically configurable trusted network interface controllers

Citations (422)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4686620A (en) 1984-07-26 1987-08-11 American Telephone And Telegraph Company, At&T Bell Laboratories Database backup method
US4713755A (en) 1985-06-28 1987-12-15 Hewlett-Packard Company Cache memory consistency control with explicit software instructions
EP0259912A1 (en) 1986-09-12 1988-03-16 Hewlett-Packard Limited File backup facility for a community of personal computers
EP0405926A2 (en) 1989-06-30 1991-01-02 Digital Equipment Corporation Method and apparatus for managing a shadow set of storage media
US4995035A (en) 1988-10-31 1991-02-19 International Business Machines Corporation Centralized management in a computer network
US5005122A (en) 1987-09-08 1991-04-02 Digital Equipment Corporation Arrangement with cooperating management server node and network service node
EP0467546A2 (en) 1990-07-18 1992-01-22 International Computers Limited Distributed data processing systems
US5093912A (en) 1989-06-26 1992-03-03 International Business Machines Corporation Dynamic resource pool expansion and contraction in multiprocessing environments
US5133065A (en) 1989-07-27 1992-07-21 Personal Computer Peripherals Corporation Backup computer program for networks
US5193154A (en) 1987-07-10 1993-03-09 Hitachi, Ltd. Buffered peripheral system and method for backing up and retrieving data to and from backup memory device
US5212772A (en) 1991-02-11 1993-05-18 Gigatrend Incorporated System for storing data in backup tape device
US5226157A (en) 1988-03-11 1993-07-06 Hitachi, Ltd. Backup control method and system in data processing system using identifiers for controlling block data transfer
US5239647A (en) 1990-09-07 1993-08-24 International Business Machines Corporation Data storage hierarchy with shared storage level
US5241668A (en) 1992-04-20 1993-08-31 International Business Machines Corporation Method and system for automated termination and resumption in a time zero backup copy process
US5241670A (en) 1992-04-20 1993-08-31 International Business Machines Corporation Method and system for automated backup copy ordering in a time zero backup copy session
US5276867A (en) 1989-12-19 1994-01-04 Epoch Systems, Inc. Digital data storage system with improved data migration
US5276860A (en) 1989-12-19 1994-01-04 Epoch Systems, Inc. Digital data processor with improved backup storage
US5287500A (en) 1991-06-03 1994-02-15 Digital Equipment Corporation System for allocating storage spaces based upon required and optional service attributes having assigned piorities
US5321816A (en) 1989-10-10 1994-06-14 Unisys Corporation Local-remote apparatus with specialized image storage modules
US5333315A (en) 1991-06-27 1994-07-26 Digital Equipment Corporation System of device independent file directories using a tag between the directories and file descriptors that migrate with the files
US5347653A (en) 1991-06-28 1994-09-13 Digital Equipment Corporation System for reconstructing prior versions of indexes using records indicating changes between successive versions of the indexes
US5410700A (en) 1991-09-04 1995-04-25 International Business Machines Corporation Computer system which supports asynchronous commitment of data
WO1995013580A1 (en) 1993-11-09 1995-05-18 Arcada Software Data backup and restore system for a computer network
US5437012A (en) 1993-04-19 1995-07-25 Canon Information Systems, Inc. System for updating directory information and data on write once media such as an optical memory card
US5448724A (en) 1993-07-02 1995-09-05 Fujitsu Limited Data processing system having double supervising functions
US5491810A (en) 1994-03-01 1996-02-13 International Business Machines Corporation Method and system for automated data storage system space allocation utilizing prioritized data set parameters
US5495607A (en) 1993-11-15 1996-02-27 Conner Peripherals, Inc. Network management system having virtual catalog overview of files distributively stored across network domain
US5504873A (en) 1989-11-01 1996-04-02 E-Systems, Inc. Mass data storage and retrieval system
US5544345A (en) 1993-11-08 1996-08-06 International Business Machines Corporation Coherence controls for store-multiple shared data coordinated by cache directory entries in a shared electronic storage
US5544347A (en) 1990-09-24 1996-08-06 Emc Corporation Data storage system controlled remote data mirroring with respectively maintained data indices
US5559957A (en) 1995-05-31 1996-09-24 Lucent Technologies Inc. File system for a data storage device having a power fail recovery mechanism for write/replace operations
US5604862A (en) 1995-03-14 1997-02-18 Network Integrity, Inc. Continuously-snapshotted protection of computer files
US5606686A (en) 1993-10-22 1997-02-25 Hitachi, Ltd. Access control method for a shared main memory in a multiprocessor based upon a directory held at a storage location of data in the memory after reading data to a processor
US5619644A (en) 1995-09-18 1997-04-08 International Business Machines Corporation Software directed microcode state save for distributed storage controller
US5628004A (en) 1994-11-04 1997-05-06 Optima Direct, Inc. System for managing database of communication of recipients
EP0774715A1 (en) 1995-10-23 1997-05-21 Stac Electronics System for backing up files from disk volumes on multiple nodes of a computer network
US5634052A (en) 1994-10-24 1997-05-27 International Business Machines Corporation System for reducing storage requirements and transmission loads in a backup subsystem in client-server environment by transmitting only delta files from client to server
US5638509A (en) 1994-06-10 1997-06-10 Exabyte Corporation Data storage and protection system
US5673381A (en) 1994-05-27 1997-09-30 Cheyenne Software International Sales Corp. System and parallel streaming and data stripping to back-up a network
EP0809184A1 (en) 1996-05-23 1997-11-26 International Business Machines Corporation Availability and recovery of files using copy storage pools
US5699361A (en) 1995-07-18 1997-12-16 Industrial Technology Research Institute Multimedia channel formulation mechanism
US5729743A (en) 1995-11-17 1998-03-17 Deltatech Research, Inc. Computer apparatus and method for merging system deltas
US5751997A (en) 1993-01-21 1998-05-12 Apple Computer, Inc. Method and apparatus for transferring archival data among an arbitrarily large number of computer devices in a networked computer environment
US5758359A (en) 1996-10-24 1998-05-26 Digital Equipment Corporation Method and apparatus for performing retroactive backups in a computer system
US5761677A (en) 1996-01-03 1998-06-02 Sun Microsystems, Inc. Computer system method and apparatus providing for various versions of a file without requiring data copy or log operations
US5764972A (en) 1993-02-01 1998-06-09 Lsc, Inc. Archiving file system for data servers in a distributed network environment
US5794229A (en) 1993-04-16 1998-08-11 Sybase, Inc. Database system with methodology for storing a database table by vertically partitioning all columns of the table
US5813008A (en) 1996-07-12 1998-09-22 Microsoft Corporation Single instance storage of information
US5813009A (en) 1995-07-28 1998-09-22 Univirtual Corp. Computer based records management system method
US5813017A (en) 1994-10-24 1998-09-22 International Business Machines Corporation System and method for reducing storage requirement in backup subsystems utilizing segmented compression and differencing
US5812398A (en) 1996-06-10 1998-09-22 Sun Microsystems, Inc. Method and system for escrowed backup of hotelled world wide web sites
US5822780A (en) 1996-12-31 1998-10-13 Emc Corporation Method and apparatus for hierarchical storage management for data base management systems
US5862325A (en) * 1996-02-29 1999-01-19 Intermind Corporation Computer-based communication system and method using metadata defining a control structure
US5875478A (en) 1996-12-03 1999-02-23 Emc Corporation Computer backup using a file system, network, disk, tape and remote archiving repository media system
EP0899662A1 (en) 1997-08-29 1999-03-03 Hewlett-Packard Company Backup and restore system for a computer network
WO1999012098A1 (en) 1997-08-29 1999-03-11 Hewlett-Packard Company Data backup and recovery systems
US5887134A (en) 1997-06-30 1999-03-23 Sun Microsystems System and method for preserving message order while employing both programmed I/O and DMA operations
US5901327A (en) 1996-05-28 1999-05-04 Emc Corporation Bundling of write data from channel commands in a command chain for transmission over a data link between data storage systems for remote data mirroring
US5924102A (en) 1997-05-07 1999-07-13 International Business Machines Corporation System and method for managing critical files
US5940833A (en) 1996-07-12 1999-08-17 Microsoft Corporation Compressing sets of integers
US5950205A (en) 1997-09-25 1999-09-07 Cisco Technology, Inc. Data transmission over the internet using a cache memory file system
US5974563A (en) 1995-10-16 1999-10-26 Network Specialists, Inc. Real time backup system
US5990810A (en) 1995-02-17 1999-11-23 Williams; Ross Neil Method for partitioning a block of data into subblocks and for storing and communicating such subblocks
US6021415A (en) 1997-10-29 2000-02-01 International Business Machines Corporation Storage management system with file aggregation and space reclamation within aggregated files
US6026414A (en) 1998-03-05 2000-02-15 International Business Machines Corporation System including a proxy client to backup files in a distributed computing environment
EP0981090A1 (en) 1998-08-17 2000-02-23 Connected Place Limited A method of producing a checkpoint which describes a base file and a method of generating a difference file defining differences between an updated file and a base file
US6052735A (en) 1997-10-24 2000-04-18 Microsoft Corporation Electronic mail object synchronization between a desktop computer and mobile device
US6073133A (en) 1998-05-15 2000-06-06 Micron Electronics Inc. Electronic mail attachment verifier
US6076148A (en) 1997-12-26 2000-06-13 Emc Corporation Mass storage subsystem and backup arrangement for digital data processing system which permits information to be backed up while host computer(s) continue(s) operating in connection with information stored on mass storage subsystem
US6094416A (en) 1997-05-09 2000-07-25 I/O Control Corporation Multi-tier architecture for control network
US6125369A (en) 1997-10-02 2000-09-26 Microsoft Corporation Continuous object synchronization between object stores on different computers
US6131095A (en) 1996-12-11 2000-10-10 Hewlett-Packard Company Method of accessing a target entity over a communications network
US6131190A (en) 1997-12-18 2000-10-10 Sidwell; Leland P. System for modifying JCL parameters to optimize data storage allocations
US6154787A (en) 1998-01-21 2000-11-28 Unisys Corporation Grouping shared resources into one or more pools and automatically re-assigning shared resources from where they are not currently needed to where they are needed
US6161111A (en) 1998-03-31 2000-12-12 Emc Corporation System and method for performing file-handling operations in a digital data processing system using an operating system-independent file map
US6167402A (en) 1998-04-27 2000-12-26 Sun Microsystems, Inc. High performance message store
US6173291B1 (en) 1997-09-26 2001-01-09 Powerquest Corporation Method and apparatus for recovering data from damaged or corrupted file storage media
US6212512B1 (en) 1999-01-06 2001-04-03 Hewlett-Packard Company Integration of a database into file management software for protecting, tracking and retrieving data
US6260069B1 (en) 1998-02-10 2001-07-10 International Business Machines Corporation Direct data retrieval in a distributed computing system
US6269431B1 (en) 1998-08-13 2001-07-31 Emc Corporation Virtual storage and block level direct access of secondary storage for recovery of backup data
US6275953B1 (en) 1997-09-26 2001-08-14 Emc Corporation Recovery from failure of a data processor in a network server
US6301592B1 (en) 1997-11-05 2001-10-09 Hitachi, Ltd. Method of and an apparatus for displaying version information and configuration information and a computer-readable recording medium on which a version and configuration information display program is recorded
US6311252B1 (en) 1997-06-30 2001-10-30 Emc Corporation Method and apparatus for moving data between storage levels of a hierarchically arranged data storage system
US20010037323A1 (en) 2000-02-18 2001-11-01 Moulton Gregory Hagan Hash file system and method for use in a commonality factoring system
US6324544B1 (en) 1998-10-21 2001-11-27 Microsoft Corporation File object synchronization between a desktop computer and a mobile device
US6324581B1 (en) 1999-03-03 2001-11-27 Emc Corporation File server system using file system storage, data movers, and an exchange of meta data among data movers for file locking and direct access to shared file systems
US6330570B1 (en) 1998-03-02 2001-12-11 Hewlett-Packard Company Data backup system
US6330642B1 (en) 2000-06-29 2001-12-11 Bull Hn Information Systems Inc. Three interconnected raid disk controller data processing system architecture
US6328766B1 (en) 1997-01-23 2001-12-11 Overland Data, Inc. Media element library with non-overlapping subset of media elements and non-overlapping subset of media element drives accessible to first host and unaccessible to second host
US6343324B1 (en) 1999-09-13 2002-01-29 International Business Machines Corporation Method and system for controlling access share storage devices in a network environment by configuring host-to-volume mapping data structures in the controller memory for granting and denying access to the devices
US6356915B1 (en) 1999-02-22 2002-03-12 Starbase Corp. Installable file system having virtual file system drive, virtual device driver, and virtual disks
US6356801B1 (en) 2000-05-19 2002-03-12 International Business Machines Corporation High availability work queuing in an automated data storage library
USRE37601E1 (en) 1992-04-20 2002-03-19 International Business Machines Corporation Method and system for incremental time zero backup copying of data
US20020055972A1 (en) 2000-05-08 2002-05-09 Weinman Joseph Bernard Dynamic content distribution and data continuity architecture
US6389432B1 (en) 1999-04-05 2002-05-14 Auspex Systems, Inc. Intelligent virtual volume access
US20020065892A1 (en) 2000-11-30 2002-05-30 Malik Dale W. Method and apparatus for minimizing storage of common attachment files in an e-mail communications server
US6418478B1 (en) 1997-10-30 2002-07-09 Commvault Systems, Inc. Pipelined high speed data transfer mechanism
US6421711B1 (en) 1998-06-29 2002-07-16 Emc Corporation Virtual ports for data transferring of a data storage system
US20020099806A1 (en) 2000-11-30 2002-07-25 Phillip Balsamo Processing node for eliminating duplicate network usage data
US6477544B1 (en) 1999-07-16 2002-11-05 Microsoft Corporation Single instance store for file systems
US6487561B1 (en) 1998-12-31 2002-11-26 Emc Corporation Apparatus and methods for copying, backing up, and restoring data using a backup segment size larger than the storage block size
US20030004922A1 (en) 2001-06-27 2003-01-02 Ontrack Data International, Inc. System and method for data management
US6513051B1 (en) 1999-07-16 2003-01-28 Microsoft Corporation Method and system for backing up and restoring files stored in a single instance store
US6519679B2 (en) 1999-06-11 2003-02-11 Dell Usa, L.P. Policy based storage configuration
US6538669B1 (en) 1999-07-15 2003-03-25 Dell Products L.P. Graphical user interface for configuration of a storage system
WO2003027891A1 (en) 2001-09-28 2003-04-03 Commvault Systems, Inc. System and method for archiving objects in an information store
US20030074600A1 (en) 2000-04-12 2003-04-17 Masaharu Tamatsu Data backup/recovery system
US6564228B1 (en) 2000-01-14 2003-05-13 Sun Microsystems, Inc. Method of enabling heterogeneous platforms to utilize a universal file system in a storage area network
US20030110190A1 (en) 2001-12-10 2003-06-12 Hitachi, Ltd. Method and system for file space management
US20030135480A1 (en) 2002-01-14 2003-07-17 Van Arsdale Robert S. System for updating a database
US6609157B2 (en) 1998-09-22 2003-08-19 Microsoft Corporation Method and apparatus for bundling messages at the expiration of a time-limit
US6609183B2 (en) 1999-02-23 2003-08-19 Legato Systems, Inc. Method and system for mirroring and archiving mass storage
US6609187B1 (en) 1999-07-06 2003-08-19 Dell Products L.P. Method and apparatus for supporting resizing of file system partitions
US20030167318A1 (en) 2001-10-22 2003-09-04 Apple Computer, Inc. Intelligent synchronization of media player with host computer
US20030172368A1 (en) 2001-12-26 2003-09-11 Elizabeth Alumbaugh System and method for autonomously generating heterogeneous data source interoperability bridges based on semantic modeling derived from self adapting ontology
US20030177149A1 (en) 2002-03-18 2003-09-18 Coombs David Lawrence System and method for data backup
US6658526B2 (en) 1997-03-12 2003-12-02 Storage Technology Corporation Network attached virtual data storage subsystem
US20030236763A1 (en) 2002-06-25 2003-12-25 Alan Kilduff Electronic message filing system
US6675177B1 (en) 2000-06-21 2004-01-06 Teradactyl, Llc Method and system for backing up digital data
US6708195B1 (en) 1998-10-02 2004-03-16 International Business Machines Corporation Composite locking of objects in a database
US6745304B2 (en) 2001-02-15 2004-06-01 Alcatel Method and device for storing computer data with back-up operations
US6757699B2 (en) 2000-10-06 2004-06-29 Franciscan University Of Steubenville Method and system for fragmenting and reconstituting data
US6757794B2 (en) 1999-08-20 2004-06-29 Microsoft Corporation Buffering data in a hierarchical data storage environment
US20040128287A1 (en) 2002-12-20 2004-07-01 International Business Machines Corporation Self tuning database retrieval optimization using regression functions
US20040177319A1 (en) * 2002-07-16 2004-09-09 Horn Bruce L. Computer system for automatic organization, indexing and viewing of information from multiple sources
US6795903B2 (en) 2002-01-17 2004-09-21 Thomson Licensing S.A. System and method for searching for duplicate data
US6810398B2 (en) 2000-11-06 2004-10-26 Avamar Technologies, Inc. System and method for unorchestrated determination of data sequences using sticky byte factoring to determine breakpoints in digital sequences
US20040220975A1 (en) 2003-02-21 2004-11-04 Hypertrust Nv Additional hash functions in content-based addressing
US20040230817A1 (en) 2003-05-14 2004-11-18 Kenneth Ma Method and system for disaster recovery of data from a storage device
US6839819B2 (en) 2001-12-28 2005-01-04 Storage Technology Corporation Data management appliance
US20050033756A1 (en) 2003-04-03 2005-02-10 Rajiv Kottomtharayil System and method for dynamically sharing storage volumes in a computer network
US6862674B2 (en) 2002-06-06 2005-03-01 Sun Microsystems Methods and apparatus for performing a memory management technique
US6865655B1 (en) 2002-07-30 2005-03-08 Sun Microsystems, Inc. Methods and apparatus for backing up and restoring data portions stored in client computer systems
US6868417B2 (en) 2000-12-18 2005-03-15 Spinnaker Networks, Inc. Mechanism for handling file level and block level remote file accesses using the same server
US20050060643A1 (en) 2003-08-25 2005-03-17 Miavia, Inc. Document similarity detection and classification system
US20050066190A1 (en) 2003-01-02 2005-03-24 Cricket Technologies Llc Electronic archive filter and profiling apparatus, system, method, and electronically stored computer program product
US6889297B2 (en) 2001-03-23 2005-05-03 Sun Microsystems, Inc. Methods and systems for eliminating data redundancies
US20050097150A1 (en) 2003-11-03 2005-05-05 Mckeon Adrian J. Data aggregation
US20050108435A1 (en) 2003-08-21 2005-05-19 Nowacki Todd A. Method and system for electronic archival and retrieval of electronic communications
US6901493B1 (en) 1998-02-24 2005-05-31 Adaptec, Inc. Method for protecting data of a computer system
US20050131961A1 (en) 2000-02-18 2005-06-16 Margolus Norman H. Data repository and method for promoting network storage of data
US20050138081A1 (en) 2003-05-14 2005-06-23 Alshab Melanie A. Method and system for reducing information latency in a business enterprise
US6912645B2 (en) 2001-07-19 2005-06-28 Lucent Technologies Inc. Method and apparatus for archival data storage
US6928459B1 (en) 2000-07-18 2005-08-09 International Business Machines Corporation Plurality of file systems using weighted allocation to allocate space on one or more storage devices
US20050195660A1 (en) 2004-02-11 2005-09-08 Kavuri Ravi K. Clustered hierarchical file services
US20050203887A1 (en) 2004-03-12 2005-09-15 Solix Technologies, Inc. System and method for seamless access to multiple data sources
US20050210460A1 (en) 2004-03-22 2005-09-22 Microsoft Corporation Computing device with relatively limited storage space and operating/file system thereof
US6952758B2 (en) 2002-07-31 2005-10-04 International Business Machines Corporation Method and system for providing consistent data modification information to clients in a storage system
US20050234823A1 (en) 2004-04-20 2005-10-20 Rainer Schimpf Systems and methods to prevent products from counterfeiting and surplus production also of tracking their way of distribution.
US6959368B1 (en) 1999-06-29 2005-10-25 Emc Corporation Method and apparatus for duplicating computer backup data
US20050254072A1 (en) 2004-05-12 2005-11-17 Canon Kabushiki Kaisha Image data processing method, client terminal, image processing program, image data management method and image management system
US20050262193A1 (en) 2003-08-27 2005-11-24 Ascential Software Corporation Logging service for a services oriented architecture in a data integration platform
US6973553B1 (en) 2000-10-20 2005-12-06 International Business Machines Corporation Method and apparatus for using extended disk sector formatting to assist in backup and hierarchical storage management
US6976039B2 (en) 2001-05-25 2005-12-13 International Business Machines Corporation Method and system for processing backup data associated with application, querying metadata files describing files accessed by the application
US20050283461A1 (en) 2004-06-02 2005-12-22 Jorg-Stefan Sell Method and apparatus for managing electronic messages
US20050286466A1 (en) 2000-11-03 2005-12-29 Tagg James P System for providing mobile VoIP
US20060005048A1 (en) 2004-07-02 2006-01-05 Hitachi Ltd. Method and apparatus for encrypted remote copy for secure data backup and restoration
US20060010227A1 (en) 2004-06-01 2006-01-12 Rajeev Atluri Methods and apparatus for accessing data from a primary data storage system for secondary storage
US20060047894A1 (en) 2004-08-30 2006-03-02 Fujitsu Limited Data recording apparatus, and data recording control method and program
US20060047978A1 (en) 1999-02-17 2006-03-02 Sony Corporation Information processing apparatus and method, and program storage medium
US20060053305A1 (en) 2004-09-09 2006-03-09 Microsoft Corporation Method, system, and apparatus for creating saved searches and auto discovery groups for a data protection system
US20060056623A1 (en) 2000-01-31 2006-03-16 Vdg, Inc. Block encryption method and schemes for data confidentiality and integrity protection
US7017113B2 (en) 2002-01-25 2006-03-21 The United States Of America As Represented By The Secretary Of The Air Force Method and apparatus for removing redundant information from digital documents
US7035943B2 (en) 1998-05-29 2006-04-25 Yahoo! Inc. Web server content replication
US7035876B2 (en) 2001-03-19 2006-04-25 Attenex Corporation System and method for evaluating a structured message store for message redundancy
US7035880B1 (en) 1999-07-14 2006-04-25 Commvault Systems, Inc. Modular backup and retrieval system used in conjunction with a storage area network
US20060089954A1 (en) 2002-05-13 2006-04-27 Anschutz Thomas A Scalable common access back-up architecture
US20060095470A1 (en) 2004-11-04 2006-05-04 Cochran Robert A Managing a file in a network environment
WO2006052872A2 (en) 2004-11-05 2006-05-18 Commvault Systems, Inc. System and method to support single instance storage operations
US20060126615A1 (en) 2004-12-10 2006-06-15 Angtin Matthew J Transferring data among a logical layer, physical layer, and storage device
US20060129576A1 (en) 1998-01-23 2006-06-15 Emc Corporation Access to content addressable data over a network
US20060129771A1 (en) 2004-12-14 2006-06-15 International Business Machines Corporation Managing data migration
US7085904B2 (en) 2003-10-20 2006-08-01 Hitachi, Ltd. Storage system and method for backup
US20060174112A1 (en) 2004-02-27 2006-08-03 Bae Systems (Defence Systems) Limited Secure computer communication
US7089395B2 (en) 2002-10-03 2006-08-08 Hewlett-Packard Development Company, L.P. Computer systems, virtual storage systems and virtual storage system operational methods
US7089383B2 (en) 2003-06-06 2006-08-08 Hewlett-Packard Development Company, L.P. State machine and system for data redundancy
US7092956B2 (en) 2001-11-02 2006-08-15 General Electric Capital Corporation Deduplication system
US7103740B1 (en) 2003-12-31 2006-09-05 Veritas Operating Corporation Backup mechanism for a multi-class file system
US20060206547A1 (en) 2005-02-08 2006-09-14 Raghavendra Kulkarni Storing and retrieving computer data files using an encrypted network drive file system
US20060206621A1 (en) 2005-03-08 2006-09-14 John Toebes Movement of data in a distributed database system to a storage location closest to a center of activity for the data
US7111173B1 (en) 1998-09-01 2006-09-19 Tecsec, Inc. Encryption process including a biometric unit
US7117246B2 (en) 2000-02-22 2006-10-03 Sendmail, Inc. Electronic mail system with methodology providing distributed message store
US20060230081A1 (en) 2002-10-10 2006-10-12 Craswell Ronald J Backing up a wireless computing device
US20060230244A1 (en) * 2004-11-08 2006-10-12 Amarendran Arun P System and method for performing auxiliary storage operations
US20060259587A1 (en) 2005-03-21 2006-11-16 Ackerman Steve F Conserving file system with backup and validation
US7139808B2 (en) 2002-04-30 2006-11-21 Intel Corporation Method and apparatus for bandwidth-efficient and storage-efficient backups
US7143108B1 (en) 2000-04-06 2006-11-28 International Business Machines Corporation Apparatus and method for deletion of objects from an object-relational system in a customizable and database independent manner
US7143091B2 (en) 2002-02-04 2006-11-28 Cataphorn, Inc. Method and apparatus for sociological data mining
US20070022145A1 (en) 2001-11-23 2007-01-25 Srinivas Kavuri Selective data replication system and method
US7191290B1 (en) 2002-09-16 2007-03-13 Network Appliance, Inc. Apparatus and method for tandem operation in a storage network
US20070067399A1 (en) 2005-09-22 2007-03-22 Raghavendra Kulkarni Electronic mail archiving system and method
US7200604B2 (en) 2004-02-17 2007-04-03 Hewlett-Packard Development Company, L.P. Data de-duplication
US7200621B2 (en) 2003-12-17 2007-04-03 International Business Machines Corporation System to automate schema creation for table restore
US20070079170A1 (en) 2005-09-30 2007-04-05 Zimmer Vincent J Data migration in response to predicted disk failure
US20070106863A1 (en) 2005-11-04 2007-05-10 Sun Microsystems, Inc. Method and system for storing a sparse file using fill counts
US20070118573A1 (en) 2005-11-23 2007-05-24 Solix, Inc. System and method to create a subset of a database
US20070136200A1 (en) 2005-12-09 2007-06-14 Microsoft Corporation Backup broker for private, integral and affordable distributed storage
US20070156998A1 (en) 2005-12-21 2007-07-05 Gorobets Sergey A Methods for memory allocation in non-volatile memories with a directly mapped file storage system
US7246272B2 (en) 2004-01-16 2007-07-17 International Business Machines Corporation Duplicate network address detection
US20070179995A1 (en) 2005-11-28 2007-08-02 Anand Prahlad Metabase for facilitating data classification
US7272606B2 (en) 2003-11-26 2007-09-18 Veritas Operating Corporation System and method for detecting and storing file content access information within a file system
US20070226535A1 (en) * 2005-12-19 2007-09-27 Parag Gokhale Systems and methods of unified reconstruction in storage systems
US7277941B2 (en) 1998-03-11 2007-10-02 Commvault Systems, Inc. System and method for providing encryption in a storage network by storing a secured encryption key with encrypted archive data in an archive storage device
US20070233638A1 (en) * 2006-03-31 2007-10-04 International Business Machines Corporation Method and system for providing cost model data for tuning of query cache memory in databases
US7287252B2 (en) 2002-09-27 2007-10-23 The United States Of America Represented By The Secretary Of The Navy Universal client and consumer
US7290102B2 (en) 2001-06-01 2007-10-30 Hewlett-Packard Development Company, L.P. Point in time storage copy
US20070260476A1 (en) 2006-05-05 2007-11-08 Lockheed Martin Corporation System and method for immutably cataloging electronic assets in a large-scale computer system
US20070271316A1 (en) 2006-05-22 2007-11-22 I3Archives, Inc. System and method for backing up medical records
US20070288534A1 (en) 2006-06-07 2007-12-13 Dorota Zak Backup and recovery of integrated linked databases
US7310655B2 (en) 2000-07-31 2007-12-18 Microsoft Corporation Method and system for concurrent garbage collection
US7315923B2 (en) 2003-11-13 2008-01-01 Commvault Systems, Inc. System and method for combining data streams in pipelined storage operations in a storage network
US7320059B1 (en) 2005-08-26 2008-01-15 Emc Corporation Methods and apparatus for deleting content from a storage system
US7325110B2 (en) 2003-12-19 2008-01-29 Hitachi, Ltd. Method for acquiring snapshot
US7330997B1 (en) 2004-06-03 2008-02-12 Gary Odom Selective reciprocal backup
US20080047935A1 (en) 2004-04-01 2008-02-28 Christian Schmidt Manufacturing and Use of Microperforated Substrates
US7343459B2 (en) 2004-04-30 2008-03-11 Commvault Systems, Inc. Systems and methods for detecting & mitigating storage risks
US20080082714A1 (en) 2006-09-29 2008-04-03 Nasa Hq's. Systems, methods and apparatus for flash drive
US20080082736A1 (en) 2004-03-11 2008-04-03 Chow David Q Managing bad blocks in various flash memory cells for electronic data flash card
US20080098083A1 (en) 2006-10-19 2008-04-24 Oracle International Corporation System and method for data de-duplication
US7370003B2 (en) 2002-11-08 2008-05-06 Amdocs Software Systems Ltd. Method and apparatus for implied attribution of responses to promotional contacts
US7376805B2 (en) 2006-04-21 2008-05-20 Hewlett-Packard Development Company, L.P. Distributed storage array
US20080126543A1 (en) 2006-11-29 2008-05-29 Hamada Gen Data Management Server, Data Management System, Data Management Method, and Program
US7383304B2 (en) 2002-02-12 2008-06-03 Canon Kabushiki Kaisha System, method, program and storage medium for processing electronic mail
WO2008070688A1 (en) 2006-12-04 2008-06-12 Commvault Systems, Inc. Systems and methods for creating copies of data, such as archive copies
US7389345B1 (en) 2003-03-26 2008-06-17 Sprint Communications Company L.P. Filtering approach for network system alarms
US7395282B1 (en) 1999-07-15 2008-07-01 Commvault Systems, Inc. Hierarchical backup and retrieval system
WO2008080140A2 (en) 2006-12-22 2008-07-03 Commvault Systems, Inc. System and method for storing redundant information
US20080162518A1 (en) 2007-01-03 2008-07-03 International Business Machines Corporation Data aggregation and grooming in multiple geo-locations
US20080162597A1 (en) 2006-12-27 2008-07-03 Research In Motion Limited Method and apparatus for synchronizing databases connected by wireless interface
US7403942B1 (en) 2003-02-04 2008-07-22 Seisint, Inc. Method and system for processing data records
US7409522B1 (en) 2005-10-26 2008-08-05 Network Appliance, Inc. Method and system for reallocating data in a file system
US20080244204A1 (en) 2007-03-29 2008-10-02 Nick Cremelie Replication and restoration of single-instance storage pools
US20080243769A1 (en) 2007-03-30 2008-10-02 Symantec Corporation System and method for exporting data directly from deduplication storage to non-deduplication storage
US20080244172A1 (en) 2007-03-29 2008-10-02 Yoshiki Kano Method and apparatus for de-duplication after mirror operation
US20080243914A1 (en) 2006-12-22 2008-10-02 Anand Prahlad System and method for storing redundant information
US7440982B2 (en) 2003-11-13 2008-10-21 Commvault Systems, Inc. System and method for stored data archive verification
US7444387B2 (en) 2001-06-06 2008-10-28 Microsoft Corporation Locating potentially identical objects across multiple computers based on stochastic partitioning of workload
US7451166B2 (en) 2005-01-13 2008-11-11 International Business Machines Corporation System and method for maintaining checkpoints of a keyed data structure using a sequential log
US20080307000A1 (en) 2007-06-08 2008-12-11 Toby Charles Wood Paterson Electronic Backup of Applications
US20090012984A1 (en) 2007-07-02 2009-01-08 Equivio Ltd. Method for Organizing Large Numbers of Documents
US7478113B1 (en) 2006-04-13 2009-01-13 Symantec Operating Corporation Boundaries
US7478096B2 (en) 2003-02-26 2009-01-13 Burnside Acquisition, Llc History preservation in a computer storage system
US7480782B2 (en) 2006-06-14 2009-01-20 Sun Microsystems, Inc. Reference-updating using per-chunk referenced-address ranges in a compacting garbage collector
US7493314B2 (en) 2005-01-10 2009-02-17 Cyberlink Corp. System and method for providing access to computer files across computer operating systems
US7493456B2 (en) 2006-10-13 2009-02-17 International Business Machines Corporation Memory queue with supplemental locations for consecutive addresses
US20090049260A1 (en) 2007-08-13 2009-02-19 Upadhyayula Shivarama Narasimh High performance data deduplication in a virtual tape system
US7496604B2 (en) 2001-12-03 2009-02-24 Aol Llc Reducing duplication of files on a network
US20090083341A1 (en) 2007-09-21 2009-03-26 International Business Machines Corporation Ensuring that the archival data deleted in relational source table is already stored in relational target table
US20090083344A1 (en) 2007-09-26 2009-03-26 Hitachi, Ltd. Computer system, management computer, and file management method for file consolidation
US7512745B2 (en) 2006-04-28 2009-03-31 International Business Machines Corporation Method for garbage collection in heterogeneous multiprocessor systems
US7516208B1 (en) 2001-07-20 2009-04-07 International Business Machines Corporation Event database management method and system for network event reporting system
US7519726B2 (en) 2003-12-12 2009-04-14 International Business Machines Corporation Methods, apparatus and computer programs for enhanced access to resources within a network
US20090106369A1 (en) 2007-10-18 2009-04-23 Yen-Fu Chen Duplicate email address detection for a contact
US20090112870A1 (en) 2007-10-31 2009-04-30 Microsoft Corporation Management of distributed storage
US20090119678A1 (en) 2007-11-02 2009-05-07 Jimmy Shih Systems and methods for supporting downloadable applications on a portable client device
US7533331B2 (en) 2004-11-22 2009-05-12 Research In Motion Limited System and method for securely adding redundancy to an electronic message
US7536440B2 (en) 2003-09-18 2009-05-19 Vulcan Portals Inc. Method and system for email synchronization for an electronic device
US20090150498A1 (en) 2007-12-07 2009-06-11 Steven Joseph Branda Identifying a Plurality of Related Electronic Messages and Combining the Plurality of Related Messages Into a Composite View
US7568080B2 (en) 2002-10-07 2009-07-28 Commvault Systems, Inc. Snapshot storage and management system with indexing and user interface
US20090204636A1 (en) 2008-02-11 2009-08-13 Microsoft Corporation Multimodal object de-duplication
US20090204650A1 (en) 2007-11-15 2009-08-13 Attune Systems, Inc. File Deduplication using Copy-on-Write Storage Tiers
US7577687B2 (en) 2005-03-31 2009-08-18 Ubs Ag Systems and methods for synchronizing databases
US20090228446A1 (en) 2008-03-06 2009-09-10 Hitachi, Ltd. Method for controlling load balancing in heterogeneous computer system
US7590639B1 (en) 2004-04-29 2009-09-15 Sap Ag System and method for ordering a database flush sequence at transaction commit
US7603529B1 (en) 2006-03-22 2009-10-13 Emc Corporation Methods, systems, and computer program products for mapped logical unit (MLU) replications, storage, and retrieval in a redundant array of inexpensive disks (RAID) environment
US20090271454A1 (en) 2008-04-29 2009-10-29 International Business Machines Corporation Enhanced method and system for assuring integrity of deduplicated data
US20090268903A1 (en) 2008-04-25 2009-10-29 Netapp, Inc. Network storage server with integrated encryption, compression and deduplication capability
US7613748B2 (en) 2003-11-13 2009-11-03 Commvault Systems, Inc. Stored data reverification management system and method
US7617297B2 (en) 2004-07-26 2009-11-10 International Business Machines Corporation Providing archiving of individual mail content while maintaining a single copy mail store
US20090281847A1 (en) 2008-05-08 2009-11-12 International Business Machines Corporation (IBM) Method and System For Data Disaggregation
US7631120B2 (en) 2004-08-24 2009-12-08 Symantec Operating Corporation Methods and apparatus for optimally selecting a storage buffer for the storage of data
US7636824B1 (en) 2006-06-28 2009-12-22 Acronis Inc. System and method for efficient backup using hashes
US20090319534A1 (en) 2008-06-24 2009-12-24 Parag Gokhale Application-aware and remote single instance data management
US20090327625A1 (en) 2008-06-30 2009-12-31 International Business Machines Corporation Managing metadata for data blocks used in a deduplication system
US7647462B2 (en) 2003-09-25 2010-01-12 International Business Machines Corporation Method, system, and program for data synchronization between a primary storage device and a secondary storage device by determining whether a first identifier and a second identifier match, where a unique identifier is associated with each portion of data
US7661028B2 (en) 2005-12-19 2010-02-09 Commvault Systems, Inc. Rolling cache configuration for a data replication system
US20100036887A1 (en) 2008-08-05 2010-02-11 International Business Machines Corporation Efficient transfer of deduplicated data
US7672981B1 (en) 2007-02-28 2010-03-02 Emc Corporation Object classification and indexing of very large name spaces using grid technology
US7672779B2 (en) 2005-11-10 2010-03-02 Tele Atlas North America Inc. System and method for using universal location referencing objects to provide geographic item information
US7676590B2 (en) 2004-05-03 2010-03-09 Microsoft Corporation Background transcoding
US7685384B2 (en) 2004-02-06 2010-03-23 Globalscape, Inc. System and method for replicating files in a computer network
US7685126B2 (en) 2001-08-03 2010-03-23 Isilon Systems, Inc. System and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system
US7685177B1 (en) 2006-10-03 2010-03-23 Emc Corporation Detecting and managing orphan files between primary and secondary data stores
US7685459B1 (en) 2006-04-13 2010-03-23 Symantec Operating Corporation Parallel backup
US20100082529A1 (en) 2008-05-08 2010-04-01 Riverbed Technology, Inc. Log Structured Content Addressable Deduplicating Storage
US20100082672A1 (en) 2008-09-26 2010-04-01 Rajiv Kottomtharayil Systems and methods for managing single instancing data
US20100088296A1 (en) 2008-10-03 2010-04-08 Netapp, Inc. System and method for organizing data to facilitate data deduplication
US7721292B2 (en) 2004-12-16 2010-05-18 International Business Machines Corporation System for adjusting resource allocation to a logical partition based on rate of page swaps and utilization by changing a boot configuration file
US20100138500A1 (en) 2008-12-03 2010-06-03 Microsoft Corporation Online Archiving of Message Objects
US7734581B2 (en) 2004-05-18 2010-06-08 Oracle International Corporation Vector reads for array updates
US7739381B2 (en) 1998-03-11 2010-06-15 Commvault Systems, Inc. System and method for providing encryption in storage operations in a storage network, such as for use by application service providers that provide data storage services
US7747659B2 (en) 2004-01-05 2010-06-29 International Business Machines Corporation Garbage collector with eager read barrier
US7747584B1 (en) 2006-08-22 2010-06-29 Netapp, Inc. System and method for enabling de-duplication in a storage system architecture
US7778979B2 (en) 2002-03-26 2010-08-17 Nokia Siemens Networks Oy Method and apparatus for compressing log record information
US7788230B2 (en) 2007-01-23 2010-08-31 International Business Machines Corporation Backing-up and restoring files including files referenced with multiple file names
US7786881B2 (en) 2004-09-17 2010-08-31 Koninklijke Philips Electronics N.V. Content status provision related to volatile memories
US7814142B2 (en) 2003-08-27 2010-10-12 International Business Machines Corporation User interface service for a services oriented architecture in a data integration platform
US7818495B2 (en) 2007-09-28 2010-10-19 Hitachi, Ltd. Storage device and deduplication method
US7818287B2 (en) 2004-11-12 2010-10-19 Nec Corporation Storage management system and method and program
US7818531B2 (en) 2004-11-05 2010-10-19 Data Robotics, Inc. Storage system condition indicator and method
US20100281081A1 (en) 2009-04-29 2010-11-04 Netapp, Inc. Predicting space reclamation in deduplicated datasets
US7830889B1 (en) 2003-02-06 2010-11-09 Juniper Networks, Inc. Systems for scheduling the transmission of data in a network device
US7831793B2 (en) 2006-03-01 2010-11-09 Quantum Corporation Data storage system including unique block pool manager and applications in tiered storage
US7831707B2 (en) 2006-08-02 2010-11-09 Scenera Technologies, Llc Methods, systems, and computer program products for managing electronic subscriptions
US7836161B2 (en) 2002-10-18 2010-11-16 International Business Machines Corporation Simultaneous data backup in a computer system
US7853750B2 (en) 2007-01-30 2010-12-14 Netapp, Inc. Method and an apparatus to store data patterns
US7856414B2 (en) 2001-03-29 2010-12-21 Christopher Zee Assured archival and retrieval system for digital intellectual property
US20100332401A1 (en) 2009-06-30 2010-12-30 Anand Prahlad Performing data storage operations with a cloud storage environment, including automatically selecting among multiple cloud storage sites
US7865470B2 (en) 2004-09-09 2011-01-04 Microsoft Corporation Method, system, and apparatus for translating logical information representative of physical data in a data protection system
US7865678B2 (en) 2004-07-07 2011-01-04 Hitachi, Ltd. Remote copy system maintaining consistency
US7870105B2 (en) 2007-11-20 2011-01-11 Hitachi, Ltd. Methods and apparatus for deduplication in storage system
US7870486B2 (en) 2007-01-26 2011-01-11 Kabushiki Kaisha Toshiba System and method for simultaneously commencing output of disparately encoded electronic documents
US7873599B2 (en) 2006-07-27 2011-01-18 Hitachi, Ltd. Backup control apparatus and method eliminating duplication of information resources
US7882077B2 (en) 2006-10-17 2011-02-01 Commvault Systems, Inc. Method and system for offline indexing of content and classifying stored data
US7899990B2 (en) 2005-11-15 2011-03-01 Oracle America, Inc. Power conservation via DRAM access
US7921077B2 (en) 2006-06-29 2011-04-05 Netapp, Inc. System and method for managing data deduplication of storage systems utilizing persistent consistency point images
US20110125711A1 (en) 2009-11-23 2011-05-26 David Norman Richard Meisenheimer Generating device specific thumbnails
US7962452B2 (en) 2007-12-28 2011-06-14 International Business Machines Corporation Data deduplication by separating data from meta data
US8028106B2 (en) 2007-07-06 2011-09-27 Proster Systems, Inc. Hardware acceleration of commonality factoring with removable media
US8041907B1 (en) 2008-06-30 2011-10-18 Symantec Operating Corporation Method and system for efficient space management for single-instance-storage volumes
US8051367B2 (en) 2007-09-26 2011-11-01 Hitachi, Ltd. Storage sub-system and method for controlling the same
US8054765B2 (en) 2005-10-21 2011-11-08 Emc Corporation Systems and methods for providing variable protection
US8078603B1 (en) 2006-10-05 2011-12-13 Blinkx Uk Ltd Various methods and apparatuses for moving thumbnails
US8086799B2 (en) 2008-08-12 2011-12-27 Netapp, Inc. Scalable deduplication of stored data
US8095756B1 (en) 2009-04-28 2012-01-10 Netapp, Inc. System and method for coordinating deduplication operations and backup operations of a storage volume
US8108429B2 (en) 2004-05-07 2012-01-31 Quest Software, Inc. System for moving real-time data events across a plurality of devices in a network for simultaneous data protection, replication, and access services
US8112357B2 (en) 2006-11-07 2012-02-07 Federal Reserve Bank Of Atlanta Systems and methods for preventing duplicative electronic check processing
US8131687B2 (en) 2008-11-13 2012-03-06 International Business Machines Corporation File system with internal deduplication and management of data blocks
US8156092B2 (en) 2008-01-29 2012-04-10 Hewett Jeffrey R Document de-duplication and modification detection
US8166263B2 (en) 2008-07-03 2012-04-24 Commvault Systems, Inc. Continuous data protection over intermittent connections, such as continuous data backup for laptops or wireless devices
US8165221B2 (en) 2006-04-28 2012-04-24 Netapp, Inc. System and method for sampling based elimination of duplicate data
US20120102286A1 (en) 2010-10-26 2012-04-26 Holt Keith W Methods and structure for online migration of data in storage systems comprising a plurality of storage devices
US8170994B2 (en) 2007-09-28 2012-05-01 Symantec Corporation Techniques for virtual archiving
US8190835B1 (en) 2007-12-31 2012-05-29 Emc Corporation Global de-duplication in shared architectures
US8190823B2 (en) 2008-09-18 2012-05-29 Lenovo (Singapore) Pte. Ltd. Apparatus, system and method for storage cache deduplication
US20120150818A1 (en) 2010-12-14 2012-06-14 Commvault Systems, Inc. Client-side repository in a networked deduplicated storage system
US20120159098A1 (en) 2010-12-17 2012-06-21 Microsoft Corporation Garbage collection and hotspots relief for a data deduplication chunk store
US8213540B1 (en) 2007-04-27 2012-07-03 Marvell International Ltd. System and method of transmit beam selection
US8219524B2 (en) 2008-06-24 2012-07-10 Commvault Systems, Inc. Application-aware and remote single instance data management
US8234444B2 (en) 2008-03-11 2012-07-31 International Business Machines Corporation Apparatus and method to select a deduplication protocol for a data storage library
US8239348B1 (en) 2008-08-14 2012-08-07 Symantec Corporation Method and apparatus for automatically archiving data items from backup storage
US8244914B1 (en) 2009-07-31 2012-08-14 Symantec Corporation Systems and methods for restoring email databases
US20120233417A1 (en) 2011-03-11 2012-09-13 Microsoft Corporation Backup and restore strategies for data deduplication
US8271992B2 (en) 2007-08-29 2012-09-18 Nirvanix, Inc. Load based file allocation among a plurality of storage devices
US8295875B2 (en) 2006-11-28 2012-10-23 Fujitsu Toshiba Mobile Communications Limited Apparatus and method for mobile communication by using non-volatile memory device
US8296301B2 (en) 2008-01-30 2012-10-23 Commvault Systems, Inc. Systems and methods for probabilistic data classification
US8315984B2 (en) 2007-05-22 2012-11-20 Netapp, Inc. System and method for on-the-fly elimination of redundant data
US20120311581A1 (en) 2011-05-31 2012-12-06 International Business Machines Corporation Adaptive parallel data processing
US8346730B2 (en) 2008-04-25 2013-01-01 Netapp, Inc. Deduplication of data on disk devices based on a threshold number of sequential blocks
US8352422B2 (en) 2010-03-30 2013-01-08 Commvault Systems, Inc. Data restore systems and methods in a replication environment
US8364652B2 (en) 2010-09-30 2013-01-29 Commvault Systems, Inc. Content aligned block-based deduplication
US8375008B1 (en) 2003-01-17 2013-02-12 Robert Gomes Method and system for enterprise-wide retention of digital or electronic data
US20130041872A1 (en) 2011-08-12 2013-02-14 Alexander AIZMAN Cloud storage system with distributed metadata
US8386436B2 (en) 2008-09-30 2013-02-26 Rainstor Limited System and method for data storage
US8401996B2 (en) 2009-03-30 2013-03-19 Commvault Systems, Inc. Storing a variable number of instances of data objects
US8412677B2 (en) 2008-11-26 2013-04-02 Commvault Systems, Inc. Systems and methods for byte-level or quasi byte-level single instancing
US8412682B2 (en) 2006-06-29 2013-04-02 Netapp, Inc. System and method for retrieving and using block fingerprints for data deduplication
US20130086007A1 (en) 2011-09-30 2013-04-04 Symantec Corporation System and method for filesystem deduplication using variable length sharing
US20130117305A1 (en) 2010-07-21 2013-05-09 Sqream Technologies Ltd System and Method for the Parallel Execution of Database Queries Over CPUs and Multi Core Processors
US8484162B2 (en) 2008-06-24 2013-07-09 Commvault Systems, Inc. De-duplication systems and methods for application-specific data
US8504515B2 (en) 2010-03-30 2013-08-06 Commvault Systems, Inc. Stubbing systems and methods in a data replication environment
US20130218350A1 (en) 2012-02-21 2013-08-22 Andrew Manzo System and Method for Real-Time Controls of Energy Consuming Devices Including Tiered Architecture
US8548953B2 (en) 2007-11-12 2013-10-01 F5 Networks, Inc. File deduplication using storage tiers
US20130262801A1 (en) 2011-09-30 2013-10-03 Commvault Systems, Inc. Information management of virtual machines having mapped storage devices
US20130262394A1 (en) 2012-03-30 2013-10-03 Commvault Systems, Inc. Search filtered file system using secondary storage
US8572340B2 (en) 2010-09-30 2013-10-29 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US20130290598A1 (en) 2012-04-25 2013-10-31 International Business Machines Corporation Reducing Power Consumption by Migration of Data within a Tiered Storage System
US8578120B2 (en) 2009-05-22 2013-11-05 Commvault Systems, Inc. Block-level single instancing
US20130339310A1 (en) 2012-06-13 2013-12-19 Commvault Systems, Inc. Restore using a client side signature repository in a networked storage system
US8620845B2 (en) 2008-09-24 2013-12-31 Timothy John Stoakes Identifying application metadata in a backup stream
US20140006382A1 (en) 2012-06-29 2014-01-02 International Business Machines Corporation Predicate pushdown with late materialization in database query processing
US8626723B2 (en) 2008-10-14 2014-01-07 Vmware, Inc. Storage-network de-duplication
US20140012814A1 (en) 2012-07-06 2014-01-09 Box, Inc. System and method for performing shard migration to support functions of a cloud-based service
US8712974B2 (en) 2008-12-22 2014-04-29 Google Inc. Asynchronous distributed de-duplication for replicated content addressable storage clusters
US20140129961A1 (en) 2012-11-07 2014-05-08 Sergey Mikhailovich Zubarev Tool for managing user task information
US8725698B2 (en) 2010-03-30 2014-05-13 Commvault Systems, Inc. Stub file prioritization in a data replication system
US20140181079A1 (en) 2012-12-20 2014-06-26 Teradata Corporation Adaptive optimization of iterative or recursive query execution by database systems
US8769185B2 (en) 2007-10-23 2014-07-01 Keicy Chung Computer storage device having separate read-only space and read-write space, removable media component, system management interface, and network interface
US20140188532A1 (en) 2012-11-13 2014-07-03 Nec Laboratories America, Inc. Multitenant Database Placement with a Cost Based Query Scheduler
US8782368B2 (en) 2007-10-25 2014-07-15 Hewlett-Packard Development Company, L.P. Storing chunks in containers
US20140201485A1 (en) 2013-01-14 2014-07-17 Commvault Systems, Inc. Pst file archiving
US20140310232A1 (en) 2013-04-11 2014-10-16 Hasso-Plattner-Institut für Softwaresystemtechnik GmbH Aggregate query-caching in databases architectures with a differential buffer and a main store
US8880797B2 (en) 2007-09-05 2014-11-04 Emc Corporation De-duplication in a virtualized server environment
US8930306B1 (en) 2009-07-08 2015-01-06 Commvault Systems, Inc. Synchronized data deduplication
US8935492B2 (en) 2010-09-30 2015-01-13 Commvault Systems, Inc. Archiving data objects using secondary copies
US8965852B2 (en) 2009-11-24 2015-02-24 Dell Products L.P. Methods and apparatus for network efficient deduplication
US8997020B2 (en) 2001-07-13 2015-03-31 Universal Electronics Inc. System and methods for interacting with a control environment
US9020900B2 (en) 2010-12-14 2015-04-28 Commvault Systems, Inc. Distributed deduplicated storage system
US9020890B2 (en) 2012-03-30 2015-04-28 Commvault Systems, Inc. Smart archiving and data previewing for mobile devices
US9026498B2 (en) 2012-08-13 2015-05-05 Commvault Systems, Inc. Lightweight mounting of a secondary copy of file system data
US20150178277A1 (en) 2013-12-23 2015-06-25 Tata Consultancy Services Limited System and method predicting effect of cache on query elapsed response time during application development stage
US9069799B2 (en) 2012-12-27 2015-06-30 Commvault Systems, Inc. Restoration of centralized data storage manager, such as data storage manager in a hierarchical data storage system
US20150212889A1 (en) 2014-01-27 2015-07-30 Commvault Systems, Inc. Techniques for serving archived electronic mail
US9213540B1 (en) 2015-05-05 2015-12-15 Archive Solutions Providers Automated workflow management system for application and data retirement
US20150363270A1 (en) 2014-06-11 2015-12-17 Commvault Systems, Inc. Conveying value of implementing an integrated data management and protection system
US9223597B2 (en) 2012-12-21 2015-12-29 Commvault Systems, Inc. Archiving virtual machines in a data storage system
US20160019224A1 (en) 2014-07-18 2016-01-21 Commvault Systems, Inc. File system content archiving based on third-party application archiving rules and metadata
US9276871B1 (en) 2014-03-20 2016-03-01 Cisco Technology, Inc. LISP stretched subnet mode for data center migrations
US9275086B2 (en) 2012-07-20 2016-03-01 Commvault Systems, Inc. Systems and methods for database archiving
US9286110B2 (en) 2013-01-14 2016-03-15 Commvault Systems, Inc. Seamless virtual machine recall in a data storage system
US9372479B1 (en) 2012-02-21 2016-06-21 Omniboard, Inc. System and method for a database layer for managing a set of energy consuming devices
US20160179435A1 (en) 2014-12-19 2016-06-23 Oracle International Corporation Systems and methods for shadow migration progress estimation
US20160210064A1 (en) 2015-01-21 2016-07-21 Commvault Systems, Inc. Database protection using block-level mapping
US20160253254A1 (en) 2015-02-27 2016-09-01 Commvault Systems, Inc. Diagnosing errors in data storage and archiving in a cloud or networking environment
US20160299818A1 (en) 2015-04-09 2016-10-13 Commvault Systems, Inc. Highly reusable deduplication database after disaster recovery
US20160342633A1 (en) 2015-05-20 2016-11-24 Commvault Systems, Inc. Predicting scale of data migration between production and archive storage systems, such as for enterprise customers having large and/or numerous files
US9575673B2 (en) 2014-10-29 2017-02-21 Commvault Systems, Inc. Accessing a file system using tiered deduplication
US20170083408A1 (en) 2012-12-28 2017-03-23 Commvault Systems, Inc. Backup and restoration for a deduplicated file system
US9633033B2 (en) 2013-01-11 2017-04-25 Commvault Systems, Inc. High availability distributed deduplicated storage system
US9633056B2 (en) 2014-03-17 2017-04-25 Commvault Systems, Inc. Maintaining a deduplication database
US9646166B2 (en) 2013-08-05 2017-05-09 International Business Machines Corporation Masking query data access pattern in encrypted data
US9781000B1 (en) 2014-12-22 2017-10-03 EMC IP Holding Company LLC Storage mobility using locator-identifier separation protocol
US9848046B2 (en) 2014-11-13 2017-12-19 Commvault Systems, Inc. Archiving applications in information management systems
US9928144B2 (en) 2015-03-30 2018-03-27 Commvault Systems, Inc. Storage management of data using an open-archive architecture, including streamlined access to primary data originally stored on network-attached storage and archived to secondary storage
US9939981B2 (en) 2013-09-12 2018-04-10 Commvault Systems, Inc. File manager integration with virtualization in an information management system with an enhanced storage manager, including user control and storage management of virtual machines
US20180137139A1 (en) 2016-11-16 2018-05-17 Commvault Systems, Inc. Dynamically configuring a proxy server using containerization for concurrent and/or overlapping backup, restore, and/or test operations
US20180288150A1 (en) 2017-03-28 2018-10-04 Commvault Systems, Inc. Archiving mail servers via a simple mail transfer protocol (smtp) server
US20190108341A1 (en) 2017-09-14 2019-04-11 Commvault Systems, Inc. Ransomware detection and data pruning management
US10310953B2 (en) 2015-12-30 2019-06-04 Commvault Systems, Inc. System for redirecting requests after a secondary storage computing device failure
US10338823B2 (en) 2012-12-21 2019-07-02 Commvault Systems, Inc. Archiving using data obtained during backup of primary storage
US10481824B2 (en) 2015-05-26 2019-11-19 Commvault Systems, Inc. Replication using deduplicated secondary copy data
US10742735B2 (en) 2017-12-12 2020-08-11 Commvault Systems, Inc. Enhanced network attached storage (NAS) services interfacing to cloud storage

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6993162B2 (en) 2001-06-15 2006-01-31 Eastman Kodak Company Method for authenticating animation
US8055672B2 (en) * 2004-06-10 2011-11-08 International Business Machines Corporation Dynamic graphical database query and data mining interface
US20200327017A1 (en) 2019-04-10 2020-10-15 Commvault Systems, Inc. Restore using deduplicated secondary copy data
US11463264B2 (en) 2019-05-08 2022-10-04 Commvault Systems, Inc. Use of data block signatures for monitoring in an information management system

Patent Citations (505)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4686620A (en) 1984-07-26 1987-08-11 American Telephone And Telegraph Company, At&T Bell Laboratories Database backup method
US4713755A (en) 1985-06-28 1987-12-15 Hewlett-Packard Company Cache memory consistency control with explicit software instructions
EP0259912A1 (en) 1986-09-12 1988-03-16 Hewlett-Packard Limited File backup facility for a community of personal computers
US5193154A (en) 1987-07-10 1993-03-09 Hitachi, Ltd. Buffered peripheral system and method for backing up and retrieving data to and from backup memory device
US5005122A (en) 1987-09-08 1991-04-02 Digital Equipment Corporation Arrangement with cooperating management server node and network service node
US5226157A (en) 1988-03-11 1993-07-06 Hitachi, Ltd. Backup control method and system in data processing system using identifiers for controlling block data transfer
US4995035A (en) 1988-10-31 1991-02-19 International Business Machines Corporation Centralized management in a computer network
US5093912A (en) 1989-06-26 1992-03-03 International Business Machines Corporation Dynamic resource pool expansion and contraction in multiprocessing environments
EP0405926A2 (en) 1989-06-30 1991-01-02 Digital Equipment Corporation Method and apparatus for managing a shadow set of storage media
US5133065A (en) 1989-07-27 1992-07-21 Personal Computer Peripherals Corporation Backup computer program for networks
US5321816A (en) 1989-10-10 1994-06-14 Unisys Corporation Local-remote apparatus with specialized image storage modules
US5504873A (en) 1989-11-01 1996-04-02 E-Systems, Inc. Mass data storage and retrieval system
US5276867A (en) 1989-12-19 1994-01-04 Epoch Systems, Inc. Digital data storage system with improved data migration
US5276860A (en) 1989-12-19 1994-01-04 Epoch Systems, Inc. Digital data processor with improved backup storage
EP0467546A2 (en) 1990-07-18 1992-01-22 International Computers Limited Distributed data processing systems
US5239647A (en) 1990-09-07 1993-08-24 International Business Machines Corporation Data storage hierarchy with shared storage level
US5544347A (en) 1990-09-24 1996-08-06 Emc Corporation Data storage system controlled remote data mirroring with respectively maintained data indices
US5212772A (en) 1991-02-11 1993-05-18 Gigatrend Incorporated System for storing data in backup tape device
US5287500A (en) 1991-06-03 1994-02-15 Digital Equipment Corporation System for allocating storage spaces based upon required and optional service attributes having assigned piorities
US5333315A (en) 1991-06-27 1994-07-26 Digital Equipment Corporation System of device independent file directories using a tag between the directories and file descriptors that migrate with the files
US5347653A (en) 1991-06-28 1994-09-13 Digital Equipment Corporation System for reconstructing prior versions of indexes using records indicating changes between successive versions of the indexes
US5410700A (en) 1991-09-04 1995-04-25 International Business Machines Corporation Computer system which supports asynchronous commitment of data
US5241670A (en) 1992-04-20 1993-08-31 International Business Machines Corporation Method and system for automated backup copy ordering in a time zero backup copy session
US5241668A (en) 1992-04-20 1993-08-31 International Business Machines Corporation Method and system for automated termination and resumption in a time zero backup copy process
USRE37601E1 (en) 1992-04-20 2002-03-19 International Business Machines Corporation Method and system for incremental time zero backup copying of data
US5751997A (en) 1993-01-21 1998-05-12 Apple Computer, Inc. Method and apparatus for transferring archival data among an arbitrarily large number of computer devices in a networked computer environment
US5764972A (en) 1993-02-01 1998-06-09 Lsc, Inc. Archiving file system for data servers in a distributed network environment
US5794229A (en) 1993-04-16 1998-08-11 Sybase, Inc. Database system with methodology for storing a database table by vertically partitioning all columns of the table
US5437012A (en) 1993-04-19 1995-07-25 Canon Information Systems, Inc. System for updating directory information and data on write once media such as an optical memory card
US5742792A (en) 1993-04-23 1998-04-21 Emc Corporation Remote data mirroring
US5448724A (en) 1993-07-02 1995-09-05 Fujitsu Limited Data processing system having double supervising functions
US5606686A (en) 1993-10-22 1997-02-25 Hitachi, Ltd. Access control method for a shared main memory in a multiprocessor based upon a directory held at a storage location of data in the memory after reading data to a processor
US5544345A (en) 1993-11-08 1996-08-06 International Business Machines Corporation Coherence controls for store-multiple shared data coordinated by cache directory entries in a shared electronic storage
WO1995013580A1 (en) 1993-11-09 1995-05-18 Arcada Software Data backup and restore system for a computer network
US5495607A (en) 1993-11-15 1996-02-27 Conner Peripherals, Inc. Network management system having virtual catalog overview of files distributively stored across network domain
US5491810A (en) 1994-03-01 1996-02-13 International Business Machines Corporation Method and system for automated data storage system space allocation utilizing prioritized data set parameters
US5673381A (en) 1994-05-27 1997-09-30 Cheyenne Software International Sales Corp. System and parallel streaming and data stripping to back-up a network
US5638509A (en) 1994-06-10 1997-06-10 Exabyte Corporation Data storage and protection system
US5634052A (en) 1994-10-24 1997-05-27 International Business Machines Corporation System for reducing storage requirements and transmission loads in a backup subsystem in client-server environment by transmitting only delta files from client to server
US5813017A (en) 1994-10-24 1998-09-22 International Business Machines Corporation System and method for reducing storage requirement in backup subsystems utilizing segmented compression and differencing
US5806057A (en) 1994-11-04 1998-09-08 Optima Direct, Inc. System for managing database of communication recipients
US5628004A (en) 1994-11-04 1997-05-06 Optima Direct, Inc. System for managing database of communication of recipients
US5990810A (en) 1995-02-17 1999-11-23 Williams; Ross Neil Method for partitioning a block of data into subblocks and for storing and communcating such subblocks
US5604862A (en) 1995-03-14 1997-02-18 Network Integrity, Inc. Continuously-snapshotted protection of computer files
US5559957A (en) 1995-05-31 1996-09-24 Lucent Technologies Inc. File system for a data storage device having a power fail recovery mechanism for write/replace operations
US5699361A (en) 1995-07-18 1997-12-16 Industrial Technology Research Institute Multimedia channel formulation mechanism
US5813009A (en) 1995-07-28 1998-09-22 Univirtual Corp. Computer based records management system method
US5619644A (en) 1995-09-18 1997-04-08 International Business Machines Corporation Software directed microcode state save for distributed storage controller
US5974563A (en) 1995-10-16 1999-10-26 Network Specialists, Inc. Real time backup system
US5778395A (en) 1995-10-23 1998-07-07 Stac, Inc. System for backing up files from disk volumes on multiple nodes of a computer network
US20020107877A1 (en) 1995-10-23 2002-08-08 Douglas L. Whiting System for backing up files from disk volumes on multiple nodes of a computer network
EP0774715A1 (en) 1995-10-23 1997-05-21 Stac Electronics System for backing up files from disk volumes on multiple nodes of a computer network
US5729743A (en) 1995-11-17 1998-03-17 Deltatech Research, Inc. Computer apparatus and method for merging system deltas
US5761677A (en) 1996-01-03 1998-06-02 Sun Microsystems, Inc. Computer system method and apparatus providing for various versions of a file without requiring data copy or log operations
US5862325A (en) * 1996-02-29 1999-01-19 Intermind Corporation Computer-based communication system and method using metadata defining a control structure
US6148412A (en) 1996-05-23 2000-11-14 International Business Machines Corporation Availability and recovery of files using copy storage pools
EP0809184A1 (en) 1996-05-23 1997-11-26 International Business Machines Corporation Availability and recovery of files using copy storage pools
US5901327A (en) 1996-05-28 1999-05-04 Emc Corporation Bundling of write data from channel commands in a command chain for transmission over a data link between data storage systems for remote data mirroring
US5812398A (en) 1996-06-10 1998-09-22 Sun Microsystems, Inc. Method and system for escrowed backup of hotelled world wide web sites
US5940833A (en) 1996-07-12 1999-08-17 Microsoft Corporation Compressing sets of integers
US5813008A (en) 1996-07-12 1998-09-22 Microsoft Corporation Single instance storage of information
US5758359A (en) 1996-10-24 1998-05-26 Digital Equipment Corporation Method and apparatus for performing retroactive backups in a computer system
US5875478A (en) 1996-12-03 1999-02-23 Emc Corporation Computer backup using a file system, network, disk, tape and remote archiving repository media system
US6131095A (en) 1996-12-11 2000-10-10 Hewlett-Packard Company Method of accessing a target entity over a communications network
US5822780A (en) 1996-12-31 1998-10-13 Emc Corporation Method and apparatus for hierarchical storage management for data base management systems
US6328766B1 (en) 1997-01-23 2001-12-11 Overland Data, Inc. Media element library with non-overlapping subset of media elements and non-overlapping subset of media element drives accessible to first host and unaccessible to second host
US6658526B2 (en) 1997-03-12 2003-12-02 Storage Technology Corporation Network attached virtual data storage subsystem
US5924102A (en) 1997-05-07 1999-07-13 International Business Machines Corporation System and method for managing critical files
US6094416A (en) 1997-05-09 2000-07-25 I/O Control Corporation Multi-tier architecture for control network
US5887134A (en) 1997-06-30 1999-03-23 Sun Microsystems System and method for preserving message order while employing both programmed I/O and DMA operations
US6311252B1 (en) 1997-06-30 2001-10-30 Emc Corporation Method and apparatus for moving data between storage levels of a hierarchically arranged data storage system
EP0899662A1 (en) 1997-08-29 1999-03-03 Hewlett-Packard Company Backup and restore system for a computer network
WO1999012098A1 (en) 1997-08-29 1999-03-11 Hewlett-Packard Company Data backup and recovery systems
US5950205A (en) 1997-09-25 1999-09-07 Cisco Technology, Inc. Data transmission over the internet using a cache memory file system
US6275953B1 (en) 1997-09-26 2001-08-14 Emc Corporation Recovery from failure of a data processor in a network server
US6173291B1 (en) 1997-09-26 2001-01-09 Powerquest Corporation Method and apparatus for recovering data from damaged or corrupted file storage media
US6125369A (en) 1997-10-02 2000-09-26 Microsoft Corporation Continuous object synchronization between object stores on different computers
US6052735A (en) 1997-10-24 2000-04-18 Microsoft Corporation Electronic mail object synchronization between a desktop computer and mobile device
US6021415A (en) 1997-10-29 2000-02-01 International Business Machines Corporation Storage management system with file aggregation and space reclamation within aggregated files
US6418478B1 (en) 1997-10-30 2002-07-09 Commvault Systems, Inc. Pipelined high speed data transfer mechanism
US6301592B1 (en) 1997-11-05 2001-10-09 Hitachi, Ltd. Method of and an apparatus for displaying version information and configuration information and a computer-readable recording medium on which a version and configuration information display program is recorded
US6131190A (en) 1997-12-18 2000-10-10 Sidwell; Leland P. System for modifying JCL parameters to optimize data storage allocations
US6076148A (en) 1997-12-26 2000-06-13 Emc Corporation Mass storage subsystem and backup arrangement for digital data processing system which permits information to be backed up while host computer(s) continue(s) operating in connection with information stored on mass storage subsystem
US6154787A (en) 1998-01-21 2000-11-28 Unisys Corporation Grouping shared resources into one or more pools and automatically re-assigning shared resources from where they are not currently needed to where they are needed
US20060129576A1 (en) 1998-01-23 2006-06-15 Emc Corporation Access to content addressable data over a network
US6260069B1 (en) 1998-02-10 2001-07-10 International Business Machines Corporation Direct data retrieval in a distributed computing system
US6901493B1 (en) 1998-02-24 2005-05-31 Adaptec, Inc. Method for protecting data of a computer system
US6330570B1 (en) 1998-03-02 2001-12-11 Hewlett-Packard Company Data backup system
US6026414A (en) 1998-03-05 2000-02-15 International Business Machines Corporation System including a proxy client to backup files in a distributed computing environment
US7277941B2 (en) 1998-03-11 2007-10-02 Commvault Systems, Inc. System and method for providing encryption in a storage network by storing a secured encryption key with encrypted archive data in an archive storage device
US7739381B2 (en) 1998-03-11 2010-06-15 Commvault Systems, Inc. System and method for providing encryption in storage operations in a storage network, such as for use by application service providers that provide data storage services
US6161111A (en) 1998-03-31 2000-12-12 Emc Corporation System and method for performing file-handling operations in a digital data processing system using an operating system-independent file map
US6167402A (en) 1998-04-27 2000-12-26 Sun Microsystems, Inc. High performance message store
US6073133A (en) 1998-05-15 2000-06-06 Micron Electronics Inc. Electronic mail attachment verifier
US7035943B2 (en) 1998-05-29 2006-04-25 Yahoo! Inc. Web server content replication
US6421711B1 (en) 1998-06-29 2002-07-16 Emc Corporation Virtual ports for data transferring of a data storage system
US6269431B1 (en) 1998-08-13 2001-07-31 Emc Corporation Virtual storage and block level direct access of secondary storage for recovery of backup data
EP0981090A1 (en) 1998-08-17 2000-02-23 Connected Place Limited A method of producing a checkpoint which describes a base file and a method of generating a difference file defining differences between an updated file and a base file
US7111173B1 (en) 1998-09-01 2006-09-19 Tecsec, Inc. Encryption process including a biometric unit
US6609157B2 (en) 1998-09-22 2003-08-19 Microsoft Corporation Method and apparatus for bundling messages at the expiration of a time-limit
US6708195B1 (en) 1998-10-02 2004-03-16 International Business Machines Corporation Composite locking of objects in a database
US6324544B1 (en) 1998-10-21 2001-11-27 Microsoft Corporation File object synchronization between a desktop computer and a mobile device
US6487561B1 (en) 1998-12-31 2002-11-26 Emc Corporation Apparatus and methods for copying, backing up, and restoring data using a backup segment size larger than the storage block size
US6212512B1 (en) 1999-01-06 2001-04-03 Hewlett-Packard Company Integration of a database into file management software for protecting, tracking and retrieving data
US20060047978A1 (en) 1999-02-17 2006-03-02 Sony Corporation Information processing apparatus and method, and program storage medium
US6363400B1 (en) 1999-02-22 2002-03-26 Starbase Corp. Name space extension for an operating system
US6356915B1 (en) 1999-02-22 2002-03-12 Starbase Corp. Installable file system having virtual file system drive, virtual device driver, and virtual disks
US6609183B2 (en) 1999-02-23 2003-08-19 Legato Systems, Inc. Method and system for mirroring and archiving mass storage
US7107418B2 (en) 1999-02-23 2006-09-12 Emc Corporation Method and system for mirroring and archiving mass storage
US6324581B1 (en) 1999-03-03 2001-11-27 Emc Corporation File server system using file system storage, data movers, and an exchange of meta data among data movers for file locking and direct access to shared file systems
US6389432B1 (en) 1999-04-05 2002-05-14 Auspex Systems, Inc. Intelligent virtual volume access
US6519679B2 (en) 1999-06-11 2003-02-11 Dell Usa, L.P. Policy based storage configuration
US6959368B1 (en) 1999-06-29 2005-10-25 Emc Corporation Method and apparatus for duplicating computer backup data
US6609187B1 (en) 1999-07-06 2003-08-19 Dell Products L.P. Method and apparatus for supporting resizing of file system partitions
US7035880B1 (en) 1999-07-14 2006-04-25 Commvault Systems, Inc. Modular backup and retrieval system used in conjunction with a storage area network
US6538669B1 (en) 1999-07-15 2003-03-25 Dell Products L.P. Graphical user interface for configuration of a storage system
US7395282B1 (en) 1999-07-15 2008-07-01 Commvault Systems, Inc. Hierarchical backup and retrieval system
US6477544B1 (en) 1999-07-16 2002-11-05 Microsoft Corporation Single instance store for file systems
US6513051B1 (en) 1999-07-16 2003-01-28 Microsoft Corporation Method and system for backing up and restoring files stored in a single instance store
US6757794B2 (en) 1999-08-20 2004-06-29 Microsoft Corporation Buffering data in a hierarchical data storage environment
US6343324B1 (en) 1999-09-13 2002-01-29 International Business Machines Corporation Method and system for controlling access share storage devices in a network environment by configuring host-to-volume mapping data structures in the controller memory for granting and denying access to the devices
US6564228B1 (en) 2000-01-14 2003-05-13 Sun Microsystems, Inc. Method of enabling heterogeneous platforms to utilize a universal file system in a storage area network
US20060056623A1 (en) 2000-01-31 2006-03-16 Vdg, Inc. Block encryption method and schemes for data confidentiality and integrity protection
US20050131961A1 (en) 2000-02-18 2005-06-16 Margolus Norman H. Data repository and method for promoting network storage of data
US20010037323A1 (en) 2000-02-18 2001-11-01 Moulton Gregory Hagan Hash file system and method for use in a commonality factoring system
US20040148306A1 (en) 2000-02-18 2004-07-29 Moulton Gregory Hagan Hash file system and method for use in a commonality factoring system
US6704730B2 (en) 2000-02-18 2004-03-09 Avamar Technologies, Inc. Hash file system and method for use in a commonality factoring system
US7117246B2 (en) 2000-02-22 2006-10-03 Sendmail, Inc. Electronic mail system with methodology providing distributed message store
US7143108B1 (en) 2000-04-06 2006-11-28 International Business Machines Corporation Apparatus and method for deletion of objects from an object-relational system in a customizable and database independent manner
US20030074600A1 (en) 2000-04-12 2003-04-17 Masaharu Tamatsu Data backup/recovery system
US20020055972A1 (en) 2000-05-08 2002-05-09 Weinman Joseph Bernard Dynamic content distribution and data continuity architecture
US6356801B1 (en) 2000-05-19 2002-03-12 International Business Machines Corporation High availability work queuing in an automated data storage library
US6675177B1 (en) 2000-06-21 2004-01-06 Teradactyl, Llc Method and system for backing up digital data
US6330642B1 (en) 2000-06-29 2001-12-11 Bull Hn Information Systems Inc. Three interconnected RAID disk controller data processing system architecture
US6928459B1 (en) 2000-07-18 2005-08-09 International Business Machines Corporation Plurality of file systems using weighted allocation to allocate space on one or more storage devices
US7310655B2 (en) 2000-07-31 2007-12-18 Microsoft Corporation Method and system for concurrent garbage collection
US6757699B2 (en) 2000-10-06 2004-06-29 Franciscan University Of Steubenville Method and system for fragmenting and reconstituting data
US6973553B1 (en) 2000-10-20 2005-12-06 International Business Machines Corporation Method and apparatus for using extended disk sector formatting to assist in backup and hierarchical storage management
US20050286466A1 (en) 2000-11-03 2005-12-29 Tagg James P System for providing mobile VoIP
US6810398B2 (en) 2000-11-06 2004-10-26 Avamar Technologies, Inc. System and method for unorchestrated determination of data sequences using sticky byte factoring to determine breakpoints in digital sequences
US20020099806A1 (en) 2000-11-30 2002-07-25 Phillip Balsamo Processing node for eliminating duplicate network usage data
US20020065892A1 (en) 2000-11-30 2002-05-30 Malik Dale W. Method and apparatus for minimizing storage of common attachment files in an e-mail communications server
US7444382B2 (en) 2000-11-30 2008-10-28 At&T Intellectual Property I, L.P. Method and apparatus for minimizing storage of common attachment files in an e-mail communications server
US6868417B2 (en) 2000-12-18 2005-03-15 Spinnaker Networks, Inc. Mechanism for handling file level and block level remote file accesses using the same server
US6745304B2 (en) 2001-02-15 2004-06-01 Alcatel Method and device for storing computer data with back-up operations
US7035876B2 (en) 2001-03-19 2006-04-25 Attenex Corporation System and method for evaluating a structured message store for message redundancy
US6889297B2 (en) 2001-03-23 2005-05-03 Sun Microsystems, Inc. Methods and systems for eliminating data redundancies
US7856414B2 (en) 2001-03-29 2010-12-21 Christopher Zee Assured archival and retrieval system for digital intellectual property
US6976039B2 (en) 2001-05-25 2005-12-13 International Business Machines Corporation Method and system for processing backup data associated with application, querying metadata files describing files accessed by the application
US7290102B2 (en) 2001-06-01 2007-10-30 Hewlett-Packard Development Company, L.P. Point in time storage copy
US7444387B2 (en) 2001-06-06 2008-10-28 Microsoft Corporation Locating potentially identical objects across multiple computers based on stochastic partitioning of workload
US7487245B2 (en) 2001-06-06 2009-02-03 Microsoft Corporation Locating potentially identical objects across multiple computers based on stochastic partitioning of workload
US20050203864A1 (en) 2001-06-27 2005-09-15 Ontrack Data International, Inc. System and method for data management
US20030004922A1 (en) 2001-06-27 2003-01-02 Ontrack Data International, Inc. System and method for data management
US8997020B2 (en) 2001-07-13 2015-03-31 Universal Electronics Inc. System and methods for interacting with a control environment
US6912645B2 (en) 2001-07-19 2005-06-28 Lucent Technologies Inc. Method and apparatus for archival data storage
US7516208B1 (en) 2001-07-20 2009-04-07 International Business Machines Corporation Event database management method and system for network event reporting system
US7685126B2 (en) 2001-08-03 2010-03-23 Isilon Systems, Inc. System and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system
WO2003027891A1 (en) 2001-09-28 2003-04-03 Commvault Systems, Inc. System and method for archiving objects in an information store
US7107298B2 (en) 2001-09-28 2006-09-12 Commvault Systems, Inc. System and method for archiving objects in an information store
US8055627B2 (en) * 2001-09-28 2011-11-08 Commvault Systems, Inc. System and method for archiving objects in an information store
US20030167318A1 (en) 2001-10-22 2003-09-04 Apple Computer, Inc. Intelligent synchronization of media player with host computer
US7092956B2 (en) 2001-11-02 2006-08-15 General Electric Capital Corporation Deduplication system
US20070022145A1 (en) 2001-11-23 2007-01-25 Srinivas Kavuri Selective data replication system and method
US8161003B2 (en) 2001-11-23 2012-04-17 Commvault Systems, Inc. Selective data replication system and method
US7496604B2 (en) 2001-12-03 2009-02-24 Aol Llc Reducing duplication of files on a network
US20030110190A1 (en) 2001-12-10 2003-06-12 Hitachi, Ltd. Method and system for file space management
US20030172368A1 (en) 2001-12-26 2003-09-11 Elizabeth Alumbaugh System and method for autonomously generating heterogeneous data source interoperability bridges based on semantic modeling derived from self adapting ontology
US6839819B2 (en) 2001-12-28 2005-01-04 Storage Technology Corporation Data management appliance
US20030135480A1 (en) 2002-01-14 2003-07-17 Van Arsdale Robert S. System for updating a database
US6795903B2 (en) 2002-01-17 2004-09-21 Thomson Licensing S.A. System and method for searching for duplicate data
US7017113B2 (en) 2002-01-25 2006-03-21 The United States Of America As Represented By The Secretary Of The Air Force Method and apparatus for removing redundant information from digital documents
US7143091B2 (en) 2002-02-04 2006-11-28 Cataphora, Inc. Method and apparatus for sociological data mining
US7383304B2 (en) 2002-02-12 2008-06-03 Canon Kabushiki Kaisha System, method, program and storage medium for processing electronic mail
US20030177149A1 (en) 2002-03-18 2003-09-18 Coombs David Lawrence System and method for data backup
US7778979B2 (en) 2002-03-26 2010-08-17 Nokia Siemens Networks Oy Method and apparatus for compressing log record information
US7139808B2 (en) 2002-04-30 2006-11-21 Intel Corporation Method and apparatus for bandwidth-efficient and storage-efficient backups
US20060089954A1 (en) 2002-05-13 2006-04-27 Anschutz Thomas A Scalable common access back-up architecture
US6862674B2 (en) 2002-06-06 2005-03-01 Sun Microsystems Methods and apparatus for performing a memory management technique
US20030236763A1 (en) 2002-06-25 2003-12-25 Alan Kilduff Electronic message filing system
US20040177319A1 (en) * 2002-07-16 2004-09-09 Horn Bruce L. Computer system for automatic organization, indexing and viewing of information from multiple sources
US6865655B1 (en) 2002-07-30 2005-03-08 Sun Microsystems, Inc. Methods and apparatus for backing up and restoring data portions stored in client computer systems
US6952758B2 (en) 2002-07-31 2005-10-04 International Business Machines Corporation Method and system for providing consistent data modification information to clients in a storage system
US7191290B1 (en) 2002-09-16 2007-03-13 Network Appliance, Inc. Apparatus and method for tandem operation in a storage network
US7287252B2 (en) 2002-09-27 2007-10-23 The United States Of America Represented By The Secretary Of The Navy Universal client and consumer
US7089395B2 (en) 2002-10-03 2006-08-08 Hewlett-Packard Development Company, L.P. Computer systems, virtual storage systems and virtual storage system operational methods
US7873806B2 (en) 2002-10-07 2011-01-18 Commvault Systems, Inc. Snapshot storage and management system with indexing and user interface
US7568080B2 (en) 2002-10-07 2009-07-28 Commvault Systems, Inc. Snapshot storage and management system with indexing and user interface
US20060230081A1 (en) 2002-10-10 2006-10-12 Craswell Ronald J Backing up a wireless computing device
US7836161B2 (en) 2002-10-18 2010-11-16 International Business Machines Corporation Simultaneous data backup in a computer system
US7370003B2 (en) 2002-11-08 2008-05-06 Amdocs Software Systems Ltd. Method and apparatus for implied attribution of responses to promotional contacts
US20040128287A1 (en) 2002-12-20 2004-07-01 International Business Machines Corporation Self tuning database retrieval optimization using regression functions
US20050066190A1 (en) 2003-01-02 2005-03-24 Cricket Technologies Llc Electronic archive filter and profiling apparatus, system, method, and electronically stored computer program product
US8375008B1 (en) 2003-01-17 2013-02-12 Robert Gomes Method and system for enterprise-wide retention of digital or electronic data
US7403942B1 (en) 2003-02-04 2008-07-22 Seisint, Inc. Method and system for processing data records
US7830889B1 (en) 2003-02-06 2010-11-09 Juniper Networks, Inc. Systems for scheduling the transmission of data in a network device
US20040220975A1 (en) 2003-02-21 2004-11-04 Hypertrust Nv Additional hash functions in content-based addressing
US7478096B2 (en) 2003-02-26 2009-01-13 Burnside Acquisition, Llc History preservation in a computer storage system
US7389345B1 (en) 2003-03-26 2008-06-17 Sprint Communications Company L.P. Filtering approach for network system alarms
US20050033756A1 (en) 2003-04-03 2005-02-10 Rajiv Kottomtharayil System and method for dynamically sharing storage volumes in a computer network
US20040230817A1 (en) 2003-05-14 2004-11-18 Kenneth Ma Method and system for disaster recovery of data from a storage device
US20050138081A1 (en) 2003-05-14 2005-06-23 Alshab Melanie A. Method and system for reducing information latency in a business enterprise
US7089383B2 (en) 2003-06-06 2006-08-08 Hewlett-Packard Development Company, L.P. State machine and system for data redundancy
US20050108435A1 (en) 2003-08-21 2005-05-19 Nowacki Todd A. Method and system for electronic archival and retrieval of electronic communications
US20050060643A1 (en) 2003-08-25 2005-03-17 Miavia, Inc. Document similarity detection and classification system
US20050262193A1 (en) 2003-08-27 2005-11-24 Ascential Software Corporation Logging service for a services oriented architecture in a data integration platform
US7814142B2 (en) 2003-08-27 2010-10-12 International Business Machines Corporation User interface service for a services oriented architecture in a data integration platform
US7536440B2 (en) 2003-09-18 2009-05-19 Vulcan Portals Inc. Method and system for email synchronization for an electronic device
US7647462B2 (en) 2003-09-25 2010-01-12 International Business Machines Corporation Method, system, and program for data synchronization between a primary storage device and a secondary storage device by determining whether a first identifier and a second identifier match, where a unique identifier is associated with each portion of data
US7085904B2 (en) 2003-10-20 2006-08-01 Hitachi, Ltd. Storage system and method for backup
US20050097150A1 (en) 2003-11-03 2005-05-05 Mckeon Adrian J. Data aggregation
US7315923B2 (en) 2003-11-13 2008-01-01 Commvault Systems, Inc. System and method for combining data streams in pipelined storage operations in a storage network
US7440982B2 (en) 2003-11-13 2008-10-21 Commvault Systems, Inc. System and method for stored data archive verification
US7613748B2 (en) 2003-11-13 2009-11-03 Commvault Systems, Inc. Stored data reverification management system and method
US7272606B2 (en) 2003-11-26 2007-09-18 Veritas Operating Corporation System and method for detecting and storing file content access information within a file system
US7519726B2 (en) 2003-12-12 2009-04-14 International Business Machines Corporation Methods, apparatus and computer programs for enhanced access to resources within a network
US7200621B2 (en) 2003-12-17 2007-04-03 International Business Machines Corporation System to automate schema creation for table restore
US7325110B2 (en) 2003-12-19 2008-01-29 Hitachi, Ltd. Method for acquiring snapshot
US7103740B1 (en) 2003-12-31 2006-09-05 Veritas Operating Corporation Backup mechanism for a multi-class file system
US7747659B2 (en) 2004-01-05 2010-06-29 International Business Machines Corporation Garbage collector with eager read barrier
US7246272B2 (en) 2004-01-16 2007-07-17 International Business Machines Corporation Duplicate network address detection
US7685384B2 (en) 2004-02-06 2010-03-23 Globalscape, Inc. System and method for replicating files in a computer network
US20050195660A1 (en) 2004-02-11 2005-09-08 Kavuri Ravi K. Clustered hierarchical file services
US7200604B2 (en) 2004-02-17 2007-04-03 Hewlett-Packard Development Company, L.P. Data de-duplication
US20060174112A1 (en) 2004-02-27 2006-08-03 Bae Systems (Defence Systems) Limited Secure computer communication
US20080082736A1 (en) 2004-03-11 2008-04-03 Chow David Q Managing bad blocks in various flash memory cells for electronic data flash card
US20050203887A1 (en) 2004-03-12 2005-09-15 Solix Technologies, Inc. System and method for seamless access to multiple data sources
US20050210460A1 (en) 2004-03-22 2005-09-22 Microsoft Corporation Computing device with relatively limited storage space and operating/file system thereof
US7698699B2 (en) 2004-03-22 2010-04-13 Microsoft Corporation Computing device with relatively limited storage space and operating/file system thereof
US20080047935A1 (en) 2004-04-01 2008-02-28 Christian Schmidt Manufacturing and Use of Microperforated Substrates
US20050234823A1 (en) 2004-04-20 2005-10-20 Rainer Schimpf Systems and methods to prevent products from counterfeiting and surplus production also of tracking their way of distribution.
US7590639B1 (en) 2004-04-29 2009-09-15 Sap Ag System and method for ordering a database flush sequence at transaction commit
US7343459B2 (en) 2004-04-30 2008-03-11 Commvault Systems, Inc. Systems and methods for detecting & mitigating storage risks
US7676590B2 (en) 2004-05-03 2010-03-09 Microsoft Corporation Background transcoding
US8108429B2 (en) 2004-05-07 2012-01-31 Quest Software, Inc. System for moving real-time data events across a plurality of devices in a network for simultaneous data protection, replication, and access services
US20050254072A1 (en) 2004-05-12 2005-11-17 Canon Kabushiki Kaisha Image data processing method, client terminal, image processing program, image data management method and image management system
US7734581B2 (en) 2004-05-18 2010-06-08 Oracle International Corporation Vector reads for array updates
US20060010227A1 (en) 2004-06-01 2006-01-12 Rajeev Atluri Methods and apparatus for accessing data from a primary data storage system for secondary storage
US8055745B2 (en) 2004-06-01 2011-11-08 Inmage Systems, Inc. Methods and apparatus for accessing data from a primary data storage system for secondary storage
US20050283461A1 (en) 2004-06-02 2005-12-22 Jorg-Stefan Sell Method and apparatus for managing electronic messages
US7330997B1 (en) 2004-06-03 2008-02-12 Gary Odom Selective reciprocal backup
US20060005048A1 (en) 2004-07-02 2006-01-05 Hitachi Ltd. Method and apparatus for encrypted remote copy for secure data backup and restoration
US7383462B2 (en) 2004-07-02 2008-06-03 Hitachi, Ltd. Method and apparatus for encrypted remote copy for secure data backup and restoration
US7865678B2 (en) 2004-07-07 2011-01-04 Hitachi, Ltd. Remote copy system maintaining consistency
US7617297B2 (en) 2004-07-26 2009-11-10 International Business Machines Corporation Providing archiving of individual mail content while maintaining a single copy mail store
US7631120B2 (en) 2004-08-24 2009-12-08 Symantec Operating Corporation Methods and apparatus for optimally selecting a storage buffer for the storage of data
US20060047894A1 (en) 2004-08-30 2006-03-02 Fujitsu Limited Data recording apparatus, and data recording control method and program
US20060053305A1 (en) 2004-09-09 2006-03-09 Microsoft Corporation Method, system, and apparatus for creating saved searches and auto discovery groups for a data protection system
US7631194B2 (en) 2004-09-09 2009-12-08 Microsoft Corporation Method, system, and apparatus for creating saved searches and auto discovery groups for a data protection system
US7865470B2 (en) 2004-09-09 2011-01-04 Microsoft Corporation Method, system, and apparatus for translating logical information representative of physical data in a data protection system
US7786881B2 (en) 2004-09-17 2010-08-31 Koninklijke Philips Electronics N.V. Content status provision related to volatile memories
US20060095470A1 (en) 2004-11-04 2006-05-04 Cochran Robert A Managing a file in a network environment
WO2006052872A2 (en) 2004-11-05 2006-05-18 Commvault Systems, Inc. System and method to support single instance storage operations
US7818531B2 (en) 2004-11-05 2010-10-19 Data Robotics, Inc. Storage system condition indicator and method
US20060224846A1 (en) 2004-11-05 2006-10-05 Amarendran Arun P System and method to support single instance storage operations
US7490207B2 (en) 2004-11-08 2009-02-10 Commvault Systems, Inc. System and method for performing auxiliary storage operations
US20060230244A1 (en) * 2004-11-08 2006-10-12 Amarendran Arun P System and method for performing auxiliary storage operations
US7818287B2 (en) 2004-11-12 2010-10-19 Nec Corporation Storage management system and method and program
US7533331B2 (en) 2004-11-22 2009-05-12 Research In Motion Limited System and method for securely adding redundancy to an electronic message
US20060126615A1 (en) 2004-12-10 2006-06-15 Anglin Matthew J Transferring data among a logical layer, physical layer, and storage device
US20060129771A1 (en) 2004-12-14 2006-06-15 International Business Machines Corporation Managing data migration
US7721292B2 (en) 2004-12-16 2010-05-18 International Business Machines Corporation System for adjusting resource allocation to a logical partition based on rate of page swaps and utilization by changing a boot configuration file
US7493314B2 (en) 2005-01-10 2009-02-17 Cyberlink Corp. System and method for providing access to computer files across computer operating systems
US7451166B2 (en) 2005-01-13 2008-11-11 International Business Machines Corporation System and method for maintaining checkpoints of a keyed data structure using a sequential log
US20060206547A1 (en) 2005-02-08 2006-09-14 Raghavendra Kulkarni Storing and retrieving computer data files using an encrypted network drive file system
US20060206621A1 (en) 2005-03-08 2006-09-14 John Toebes Movement of data in a distributed database system to a storage location closest to a center of activity for the data
US20060259587A1 (en) 2005-03-21 2006-11-16 Ackerman Steve F Conserving file system with backup and validation
US7577687B2 (en) 2005-03-31 2009-08-18 Ubs Ag Systems and methods for synchronizing databases
US7320059B1 (en) 2005-08-26 2008-01-15 Emc Corporation Methods and apparatus for deleting content from a storage system
US20070067399A1 (en) 2005-09-22 2007-03-22 Raghavendra Kulkarni Electronic mail archiving system and method
US20070079170A1 (en) 2005-09-30 2007-04-05 Zimmer Vincent J Data migration in response to predicted disk failure
US8054765B2 (en) 2005-10-21 2011-11-08 Emc Corporation Systems and methods for providing variable protection
US7409522B1 (en) 2005-10-26 2008-08-05 Network Appliance, Inc. Method and system for reallocating data in a file system
US7716445B2 (en) 2005-11-04 2010-05-11 Oracle America, Inc. Method and system for storing a sparse file using fill counts
US20070106863A1 (en) 2005-11-04 2007-05-10 Sun Microsystems, Inc. Method and system for storing a sparse file using fill counts
US7672779B2 (en) 2005-11-10 2010-03-02 Tele Atlas North America Inc. System and method for using universal location referencing objects to provide geographic item information
US7899990B2 (en) 2005-11-15 2011-03-01 Oracle America, Inc. Power conservation via DRAM access
US20070118573A1 (en) 2005-11-23 2007-05-24 Solix, Inc. System and method to create a subset of a database
US7831795B2 (en) 2005-11-28 2010-11-09 Commvault Systems, Inc. Systems and methods for classifying and transferring information in a storage network
US7668884B2 (en) 2005-11-28 2010-02-23 Commvault Systems, Inc. Systems and methods for classifying and transferring information in a storage network
US7657550B2 (en) 2005-11-28 2010-02-02 Commvault Systems, Inc. User interfaces and methods for managing data in a metabase
US20070179995A1 (en) 2005-11-28 2007-08-02 Anand Prahlad Metabase for facilitating data classification
US7747579B2 (en) 2005-11-28 2010-06-29 Commvault Systems, Inc. Metabase for facilitating data classification
US20070136200A1 (en) 2005-12-09 2007-06-14 Microsoft Corporation Backup broker for private, integral and affordable distributed storage
US7661028B2 (en) 2005-12-19 2010-02-09 Commvault Systems, Inc. Rolling cache configuration for a data replication system
US20070226535A1 (en) * 2005-12-19 2007-09-27 Parag Gokhale Systems and methods of unified reconstruction in storage systems
US20070156998A1 (en) 2005-12-21 2007-07-05 Gorobets Sergey A Methods for memory allocation in non-volatile memories with a directly mapped file storage system
US7831793B2 (en) 2006-03-01 2010-11-09 Quantum Corporation Data storage system including unique block pool manager and applications in tiered storage
US7603529B1 (en) 2006-03-22 2009-10-13 Emc Corporation Methods, systems, and computer program products for mapped logical unit (MLU) replications, storage, and retrieval in a redundant array of inexpensive disks (RAID) environment
US20070233638A1 (en) * 2006-03-31 2007-10-04 International Business Machines Corporation Method and system for providing cost model data for tuning of query cache memory in databases
US7478113B1 (en) 2006-04-13 2009-01-13 Symantec Operating Corporation Boundaries
US7685459B1 (en) 2006-04-13 2010-03-23 Symantec Operating Corporation Parallel backup
US7376805B2 (en) 2006-04-21 2008-05-20 Hewlett-Packard Development Company, L.P. Distributed storage array
US8165221B2 (en) 2006-04-28 2012-04-24 Netapp, Inc. System and method for sampling based elimination of duplicate data
US7512745B2 (en) 2006-04-28 2009-03-31 International Business Machines Corporation Method for garbage collection in heterogeneous multiprocessor systems
US20070260476A1 (en) 2006-05-05 2007-11-08 Lockheed Martin Corporation System and method for immutably cataloging electronic assets in a large-scale computer system
US20070271316A1 (en) 2006-05-22 2007-11-22 I3Archives, Inc. System and method for backing up medical records
US20070288534A1 (en) 2006-06-07 2007-12-13 Dorota Zak Backup and recovery of integrated linked databases
US7480782B2 (en) 2006-06-14 2009-01-20 Sun Microsystems, Inc. Reference-updating using per-chunk referenced-address ranges in a compacting garbage collector
US7636824B1 (en) 2006-06-28 2009-12-22 Acronis Inc. System and method for efficient backup using hashes
US8412682B2 (en) 2006-06-29 2013-04-02 Netapp, Inc. System and method for retrieving and using block fingerprints for data deduplication
US7921077B2 (en) 2006-06-29 2011-04-05 Netapp, Inc. System and method for managing data deduplication of storage systems utilizing persistent consistency point images
US8296260B2 (en) 2006-06-29 2012-10-23 Netapp, Inc. System and method for managing data deduplication of storage systems utilizing persistent consistency point images
US7873599B2 (en) 2006-07-27 2011-01-18 Hitachi, Ltd. Backup control apparatus and method eliminating duplication of information resources
US7831707B2 (en) 2006-08-02 2010-11-09 Scenera Technologies, Llc Methods, systems, and computer program products for managing electronic subscriptions
US7747584B1 (en) 2006-08-22 2010-06-29 Netapp, Inc. System and method for enabling de-duplication in a storage system architecture
US20080104291A1 (en) 2006-09-29 2008-05-01 United States of America as represented by the Administrator of the National Aeronautics and Space Administration Flash drive memory apparatus and method
US20080082714A1 (en) 2006-09-29 2008-04-03 Nasa Hq's. Systems, methods and apparatus for flash drive
US7673089B2 (en) 2006-09-29 2010-03-02 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration Flash drive memory apparatus and method
US7685177B1 (en) 2006-10-03 2010-03-23 Emc Corporation Detecting and managing orphan files between primary and secondary data stores
US8078603B1 (en) 2006-10-05 2011-12-13 Blinkx Uk Ltd Various methods and apparatuses for moving thumbnails
US7493456B2 (en) 2006-10-13 2009-02-17 International Business Machines Corporation Memory queue with supplemental locations for consecutive addresses
US7882077B2 (en) 2006-10-17 2011-02-01 Commvault Systems, Inc. Method and system for offline indexing of content and classifying stored data
US20080098083A1 (en) 2006-10-19 2008-04-24 Oracle International Corporation System and method for data de-duplication
US8112357B2 (en) 2006-11-07 2012-02-07 Federal Reserve Bank Of Atlanta Systems and methods for preventing duplicative electronic check processing
US8909881B2 (en) 2006-11-28 2014-12-09 Commvault Systems, Inc. Systems and methods for creating copies of data, such as archive copies
US8295875B2 (en) 2006-11-28 2012-10-23 Fujitsu Toshiba Mobile Communications Limited Apparatus and method for mobile communication by using non-volatile memory device
US20080126543A1 (en) 2006-11-29 2008-05-29 Hamada Gen Data Management Server, Data Management System, Data Management Method, and Program
US8140786B2 (en) 2006-12-04 2012-03-20 Commvault Systems, Inc. Systems and methods for creating copies of data, such as archive copies
US8392677B2 (en) 2006-12-04 2013-03-05 Commvault Systems, Inc. Systems and methods for creating copies of data, such as archive copies
WO2008070688A1 (en) 2006-12-04 2008-06-12 Commvault Systems, Inc. Systems and methods for creating copies of data, such as archive copies
US20080229037A1 (en) 2006-12-04 2008-09-18 Alan Bunte Systems and methods for creating copies of data, such as archive copies
US7840537B2 (en) 2006-12-22 2010-11-23 Commvault Systems, Inc. System and method for storing redundant information
US20180364914A1 (en) 2006-12-22 2018-12-20 Commvault Systems, Inc. System and method for storing redundant information
US8712969B2 (en) 2006-12-22 2014-04-29 Commvault Systems, Inc. System and method for storing redundant information
WO2008080140A2 (en) 2006-12-22 2008-07-03 Commvault Systems, Inc. System and method for storing redundant information
US8037028B2 (en) 2006-12-22 2011-10-11 Commvault Systems, Inc. System and method for storing redundant information
US8285683B2 (en) 2006-12-22 2012-10-09 Commvault Systems, Inc. System and method for storing redundant information
US20160124658A1 (en) 2006-12-22 2016-05-05 Commvault Systems, Inc. System and method for storing redundant information
US20080243914A1 (en) 2006-12-22 2008-10-02 Anand Prahlad System and method for storing redundant information
US7953706B2 (en) 2006-12-22 2011-05-31 Commvault Systems, Inc. System and method for storing redundant information
US9236079B2 (en) 2006-12-22 2016-01-12 Commvault Systems, Inc. System and method for storing redundant information
US20080162597A1 (en) 2006-12-27 2008-07-03 Research In Motion Limited Method and apparatus for synchronizing databases connected by wireless interface
US20080162518A1 (en) 2007-01-03 2008-07-03 International Business Machines Corporation Data aggregation and grooming in multiple geo-locations
US7788230B2 (en) 2007-01-23 2010-08-31 International Business Machines Corporation Backing-up and restoring files including files referenced with multiple file names
US7870486B2 (en) 2007-01-26 2011-01-11 Kabushiki Kaisha Toshiba System and method for simultaneously commencing output of disparately encoded electronic documents
US7853750B2 (en) 2007-01-30 2010-12-14 Netapp, Inc. Method and an apparatus to store data patterns
US7672981B1 (en) 2007-02-28 2010-03-02 Emc Corporation Object classification and indexing of very large name spaces using grid technology
US20080244172A1 (en) 2007-03-29 2008-10-02 Yoshiki Kano Method and apparatus for de-duplication after mirror operation
US20080244204A1 (en) 2007-03-29 2008-10-02 Nick Cremelie Replication and restoration of single-instance storage pools
US20080243769A1 (en) 2007-03-30 2008-10-02 Symantec Corporation System and method for exporting data directly from deduplication storage to non-deduplication storage
US8213540B1 (en) 2007-04-27 2012-07-03 Marvell International Ltd. System and method of transmit beam selection
US8315984B2 (en) 2007-05-22 2012-11-20 Netapp, Inc. System and method for on-the-fly elimination of redundant data
US20080307000A1 (en) 2007-06-08 2008-12-11 Toby Charles Wood Paterson Electronic Backup of Applications
US20090012984A1 (en) 2007-07-02 2009-01-08 Equivio Ltd. Method for Organizing Large Numbers of Documents
US8028106B2 (en) 2007-07-06 2011-09-27 Proster Systems, Inc. Hardware acceleration of commonality factoring with removable media
US20090049260A1 (en) 2007-08-13 2009-02-19 Upadhyayula Shivarama Narasimh High performance data deduplication in a virtual tape system
US8271992B2 (en) 2007-08-29 2012-09-18 Nirvanix, Inc. Load based file allocation among a plurality of storage devices
US8880797B2 (en) 2007-09-05 2014-11-04 Emc Corporation De-duplication in a virtualized server environment
US20090083341A1 (en) 2007-09-21 2009-03-26 International Business Machines Corporation Ensuring that the archival data deleted in relational source table is already stored in relational target table
US8051367B2 (en) 2007-09-26 2011-11-01 Hitachi, Ltd. Storage sub-system and method for controlling the same
US20090083344A1 (en) 2007-09-26 2009-03-26 Hitachi, Ltd. Computer system, management computer, and file management method for file consolidation
US8156279B2 (en) 2007-09-28 2012-04-10 Hitachi, Ltd. Storage device and deduplication method
US8170994B2 (en) 2007-09-28 2012-05-01 Symantec Corporation Techniques for virtual archiving
US7818495B2 (en) 2007-09-28 2010-10-19 Hitachi, Ltd. Storage device and deduplication method
US20090106369A1 (en) 2007-10-18 2009-04-23 Yen-Fu Chen Duplicate email address detection for a contact
US8769185B2 (en) 2007-10-23 2014-07-01 Keicy Chung Computer storage device having separate read-only space and read-write space, removable media component, system management interface, and network interface
US8782368B2 (en) 2007-10-25 2014-07-15 Hewlett-Packard Development Company, L.P. Storing chunks in containers
US20090112870A1 (en) 2007-10-31 2009-04-30 Microsoft Corporation Management of distributed storage
US20090119678A1 (en) 2007-11-02 2009-05-07 Jimmy Shih Systems and methods for supporting downloadable applications on a portable client device
US8548953B2 (en) 2007-11-12 2013-10-01 F5 Networks, Inc. File deduplication using storage tiers
US20090204650A1 (en) 2007-11-15 2009-08-13 Attune Systems, Inc. File Deduplication using Copy-on-Write Storage Tiers
US7870105B2 (en) 2007-11-20 2011-01-11 Hitachi, Ltd. Methods and apparatus for deduplication in storage system
US20090150498A1 (en) 2007-12-07 2009-06-11 Steven Joseph Branda Identifying a Plurality of Related Electronic Messages and Combining the Plurality of Related Messages Into a Composite View
US8055618B2 (en) 2007-12-28 2011-11-08 International Business Machines Corporation Data deduplication by separating data from meta data
US7962452B2 (en) 2007-12-28 2011-06-14 International Business Machines Corporation Data deduplication by separating data from meta data
US8190835B1 (en) 2007-12-31 2012-05-29 Emc Corporation Global de-duplication in shared architectures
US8156092B2 (en) 2008-01-29 2012-04-10 Hewett Jeffrey R Document de-duplication and modification detection
US8296301B2 (en) 2008-01-30 2012-10-23 Commvault Systems, Inc. Systems and methods for probabilistic data classification
US20090204636A1 (en) 2008-02-11 2009-08-13 Microsoft Corporation Multimodal object de-duplication
US20090228446A1 (en) 2008-03-06 2009-09-10 Hitachi, Ltd. Method for controlling load balancing in heterogeneous computer system
US8234444B2 (en) 2008-03-11 2012-07-31 International Business Machines Corporation Apparatus and method to select a deduplication protocol for a data storage library
US8346730B2 (en) 2008-04-25 2013-01-01 Netapp, Inc. Deduplication of data on disk devices based on a threshold number of sequential blocks
US20090268903A1 (en) 2008-04-25 2009-10-29 Netapp, Inc. Network storage server with integrated encryption, compression and deduplication capability
US20090271454A1 (en) 2008-04-29 2009-10-29 International Business Machines Corporation Enhanced method and system for assuring integrity of deduplicated data
US20090281847A1 (en) 2008-05-08 2009-11-12 International Business Machines Corporation (Ibm) Method and System For Data Disaggregation
US20100082529A1 (en) 2008-05-08 2010-04-01 Riverbed Technology, Inc. Log Structured Content Addressable Deduplicating Storage
US20090319534A1 (en) 2008-06-24 2009-12-24 Parag Gokhale Application-aware and remote single instance data management
US8484162B2 (en) 2008-06-24 2013-07-09 Commvault Systems, Inc. De-duplication systems and methods for application-specific data
US8219524B2 (en) 2008-06-24 2012-07-10 Commvault Systems, Inc. Application-aware and remote single instance data management
US20120271793A1 (en) 2008-06-24 2012-10-25 Parag Gokhale Application-aware and remote single instance data management
US20090327625A1 (en) 2008-06-30 2009-12-31 International Business Machines Corporation Managing metadata for data blocks used in a deduplication system
US8041907B1 (en) 2008-06-30 2011-10-18 Symantec Operating Corporation Method and system for efficient space management for single-instance-storage volumes
US8380957B2 (en) 2008-07-03 2013-02-19 Commvault Systems, Inc. Continuous data protection over intermittent connections, such as continuous data backup for laptops or wireless devices
US8166263B2 (en) 2008-07-03 2012-04-24 Commvault Systems, Inc. Continuous data protection over intermittent connections, such as continuous data backup for laptops or wireless devices
US20100036887A1 (en) 2008-08-05 2010-02-11 International Business Machines Corporation Efficient transfer of deduplicated data
US8086799B2 (en) 2008-08-12 2011-12-27 Netapp, Inc. Scalable deduplication of stored data
US8239348B1 (en) 2008-08-14 2012-08-07 Symantec Corporation Method and apparatus for automatically archiving data items from backup storage
US8190823B2 (en) 2008-09-18 2012-05-29 Lenovo (Singapore) Pte. Ltd. Apparatus, system and method for storage cache deduplication
US8620845B2 (en) 2008-09-24 2013-12-31 Timothy John Stoakes Identifying application metadata in a backup stream
US20150205678A1 (en) 2008-09-26 2015-07-23 Commvault Systems, Inc. Systems and methods for managing single instancing data
US9015181B2 (en) 2008-09-26 2015-04-21 Commvault Systems, Inc. Systems and methods for managing single instancing data
US20100082672A1 (en) 2008-09-26 2010-04-01 Rajiv Kottomtharayil Systems and methods for managing single instancing data
US8386436B2 (en) 2008-09-30 2013-02-26 Rainstor Limited System and method for data storage
US20100088296A1 (en) 2008-10-03 2010-04-08 Netapp, Inc. System and method for organizing data to facilitate data deduplication
US8626723B2 (en) 2008-10-14 2014-01-07 Vmware, Inc. Storage-network de-duplication
US8131687B2 (en) 2008-11-13 2012-03-06 International Business Machines Corporation File system with internal deduplication and management of data blocks
US8412677B2 (en) 2008-11-26 2013-04-02 Commvault Systems, Inc. Systems and methods for byte-level or quasi byte-level single instancing
US8725687B2 (en) 2008-11-26 2014-05-13 Commvault Systems, Inc. Systems and methods for byte-level or quasi byte-level single instancing
US20140250088A1 (en) 2008-11-26 2014-09-04 Commvault Systems, Inc. Systems and methods for byte-level or quasi byte-level single instancing
US20100138500A1 (en) 2008-12-03 2010-06-03 Microsoft Corporation Online Archiving of Message Objects
US8712974B2 (en) 2008-12-22 2014-04-29 Google Inc. Asynchronous distributed de-duplication for replicated content addressable storage clusters
US9773025B2 (en) 2009-03-30 2017-09-26 Commvault Systems, Inc. Storing a variable number of instances of data objects
US20180144000A1 (en) 2009-03-30 2018-05-24 Commvault Systems, Inc. Storing a variable number of instances of data objects
US8401996B2 (en) 2009-03-30 2013-03-19 Commvault Systems, Inc. Storing a variable number of instances of data objects
US8095756B1 (en) 2009-04-28 2012-01-10 Netapp, Inc. System and method for coordinating deduplication operations and backup operations of a storage volume
US20100281081A1 (en) 2009-04-29 2010-11-04 Netapp, Inc. Predicting space reclamation in deduplicated datasets
US9058117B2 (en) 2009-05-22 2015-06-16 Commvault Systems, Inc. Block-level single instancing
US20150199242A1 (en) 2009-05-22 2015-07-16 Commvault Systems, Inc. Block-level single instancing
US8578120B2 (en) 2009-05-22 2013-11-05 Commvault Systems, Inc. Block-level single instancing
US20100332401A1 (en) 2009-06-30 2010-12-30 Anand Prahlad Performing data storage operations with a cloud storage environment, including automatically selecting among multiple cloud storage sites
US8930306B1 (en) 2009-07-08 2015-01-06 Commvault Systems, Inc. Synchronized data deduplication
US8244914B1 (en) 2009-07-31 2012-08-14 Symantec Corporation Systems and methods for restoring email databases
US20110125711A1 (en) 2009-11-23 2011-05-26 David Norman Richard Meisenheimer Generating device specific thumbnails
US8965852B2 (en) 2009-11-24 2015-02-24 Dell Products L.P. Methods and apparatus for network efficient deduplication
US8725698B2 (en) 2010-03-30 2014-05-13 Commvault Systems, Inc. Stub file prioritization in a data replication system
US8352422B2 (en) 2010-03-30 2013-01-08 Commvault Systems, Inc. Data restore systems and methods in a replication environment
US8504515B2 (en) 2010-03-30 2013-08-06 Commvault Systems, Inc. Stubbing systems and methods in a data replication environment
US20140067764A1 (en) 2010-03-30 2014-03-06 Commvault Systems, Inc. Stubbing systems and methods in a data replication environment
US20130117305A1 (en) 2010-07-21 2013-05-09 Sqream Technologies Ltd System and Method for the Parallel Execution of Database Queries Over CPUs and Multi Core Processors
US9262275B2 (en) 2010-09-30 2016-02-16 Commvault Systems, Inc. Archiving data objects using secondary copies
US8935492B2 (en) 2010-09-30 2015-01-13 Commvault Systems, Inc. Archiving data objects using secondary copies
US8364652B2 (en) 2010-09-30 2013-01-29 Commvault Systems, Inc. Content aligned block-based deduplication
US9639563B2 (en) 2010-09-30 2017-05-02 Commvault Systems, Inc. Archiving data objects using secondary copies
US8578109B2 (en) 2010-09-30 2013-11-05 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US8577851B2 (en) 2010-09-30 2013-11-05 Commvault Systems, Inc. Content aligned block-based deduplication
US20170206206A1 (en) 2010-09-30 2017-07-20 Commvault Systems, Inc. Archiving data objects using secondary copies
US8572340B2 (en) 2010-09-30 2013-10-29 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US20120102286A1 (en) 2010-10-26 2012-04-26 Holt Keith W Methods and structure for online migration of data in storage systems comprising a plurality of storage devices
US9020900B2 (en) 2010-12-14 2015-04-28 Commvault Systems, Inc. Distributed deduplicated storage system
US20120150818A1 (en) 2010-12-14 2012-06-14 Commvault Systems, Inc. Client-side repository in a networked deduplicated storage system
US9104623B2 (en) 2010-12-14 2015-08-11 Commvault Systems, Inc. Client-side repository in a networked deduplicated storage system
US8954446B2 (en) 2010-12-14 2015-02-10 Commvault Systems, Inc. Client-side repository in a networked deduplicated storage system
US9116850B2 (en) 2010-12-14 2015-08-25 Commvault Systems, Inc. Client-side repository in a networked deduplicated storage system
US20120159098A1 (en) 2010-12-17 2012-06-21 Microsoft Corporation Garbage collection and hotspots relief for a data deduplication chunk store
US20120233417A1 (en) 2011-03-11 2012-09-13 Microsoft Corporation Backup and restore strategies for data deduplication
US20120311581A1 (en) 2011-05-31 2012-12-06 International Business Machines Corporation Adaptive parallel data processing
US20130041872A1 (en) 2011-08-12 2013-02-14 Alexander AIZMAN Cloud storage system with distributed metadata
US20130262801A1 (en) 2011-09-30 2013-10-03 Commvault Systems, Inc. Information management of virtual machines having mapped storage devices
US20130086007A1 (en) 2011-09-30 2013-04-04 Symantec Corporation System and method for filesystem deduplication using variable length sharing
US20130218350A1 (en) 2012-02-21 2013-08-22 Andrew Manzo System and Method for Real-Time Controls of Energy Consuming Devices Including Tiered Architecture
US9372479B1 (en) 2012-02-21 2016-06-21 Omniboard, Inc. System and method for a database layer for managing a set of energy consuming devices
US20150205817A1 (en) 2012-03-30 2015-07-23 Commvault Systems, Inc. Smart archiving and data previewing for mobile devices
US9063938B2 (en) 2012-03-30 2015-06-23 Commvault Systems, Inc. Search filtered file system using secondary storage, including multi-dimensional indexing and searching of archived files
US9020890B2 (en) 2012-03-30 2015-04-28 Commvault Systems, Inc. Smart archiving and data previewing for mobile devices
US20130262394A1 (en) 2012-03-30 2013-10-03 Commvault Systems, Inc. Search filtered file system using secondary storage
US20130290598A1 (en) 2012-04-25 2013-10-31 International Business Machines Corporation Reducing Power Consumption by Migration of Data within a Tiered Storage System
US9218376B2 (en) 2012-06-13 2015-12-22 Commvault Systems, Inc. Intelligent data sourcing in a networked storage system
US20130339298A1 (en) 2012-06-13 2013-12-19 Commvault Systems, Inc. Collaborative backup in a networked storage system
US20130339310A1 (en) 2012-06-13 2013-12-19 Commvault Systems, Inc. Restore using a client side signature repository in a networked storage system
US9218374B2 (en) 2012-06-13 2015-12-22 Commvault Systems, Inc. Collaborative restore in a networked storage system
US9218375B2 (en) 2012-06-13 2015-12-22 Commvault Systems, Inc. Dedicated client-side signature generator in a networked storage system
US9251186B2 (en) 2012-06-13 2016-02-02 Commvault Systems, Inc. Backup using a client-side signature repository in a networked storage system
US20140006382A1 (en) 2012-06-29 2014-01-02 International Business Machines Corporation Predicate pushdown with late materialization in database query processing
US20140012814A1 (en) 2012-07-06 2014-01-09 Box, Inc. System and method for performing shard migration to support functions of a cloud-based service
US9275086B2 (en) 2012-07-20 2016-03-01 Commvault Systems, Inc. Systems and methods for database archiving
US9026498B2 (en) 2012-08-13 2015-05-05 Commvault Systems, Inc. Lightweight mounting of a secondary copy of file system data
US20140129961A1 (en) 2012-11-07 2014-05-08 Sergey Mikhailovich Zubarev Tool for managing user task information
US20140188532A1 (en) 2012-11-13 2014-07-03 Nec Laboratories America, Inc. Multitenant Database Placement with a Cost Based Query Scheduler
US20140181079A1 (en) 2012-12-20 2014-06-26 Teradata Corporation Adaptive optimization of iterative or recursive query execution by database systems
US10338823B2 (en) 2012-12-21 2019-07-02 Commvault Systems, Inc. Archiving using data obtained during backup of primary storage
US9223597B2 (en) 2012-12-21 2015-12-29 Commvault Systems, Inc. Archiving virtual machines in a data storage system
US9069799B2 (en) 2012-12-27 2015-06-30 Commvault Systems, Inc. Restoration of centralized data storage manager, such as data storage manager in a hierarchical data storage system
US20150269035A1 (en) 2012-12-27 2015-09-24 Commvault Systems, Inc. Restoration of centralized data storage manager, such as data storage manager in a hierarchical data storage system
US20170083408A1 (en) 2012-12-28 2017-03-23 Commvault Systems, Inc. Backup and restoration for a deduplicated file system
US20180239772A1 (en) 2012-12-28 2018-08-23 Commvault Systems, Inc. Backup and restoration for a deduplicated file system
US9633022B2 (en) 2012-12-28 2017-04-25 Commvault Systems, Inc. Backup and restoration for a deduplicated file system
US9665591B2 (en) 2013-01-11 2017-05-30 Commvault Systems, Inc. High availability distributed deduplicated storage system
US9633033B2 (en) 2013-01-11 2017-04-25 Commvault Systems, Inc. High availability distributed deduplicated storage system
US20170031707A1 (en) 2013-01-14 2017-02-02 Commvault Systems, Inc. Creation of virtual machine placeholders in a data storage system
US20140201485A1 (en) 2013-01-14 2014-07-17 Commvault Systems, Inc. Pst file archiving
US9286110B2 (en) 2013-01-14 2016-03-15 Commvault Systems, Inc. Seamless virtual machine recall in a data storage system
US9652283B2 (en) 2013-01-14 2017-05-16 Commvault Systems, Inc. Creation of virtual machine placeholders in a data storage system
US20140310232A1 (en) 2013-04-11 2014-10-16 Hasso-Plattner-Institut für Softwaresystemtechnik GmbH Aggregate query-caching in databases architectures with a differential buffer and a main store
US9646166B2 (en) 2013-08-05 2017-05-09 International Business Machines Corporation Masking query data access pattern in encrypted data
US9939981B2 (en) 2013-09-12 2018-04-10 Commvault Systems, Inc. File manager integration with virtualization in an information management system with an enhanced storage manager, including user control and storage management of virtual machines
US20150178277A1 (en) 2013-12-23 2015-06-25 Tata Consultancy Services Limited System and method predicting effect of cache on query elapsed response time during application development stage
US10324897B2 (en) 2014-01-27 2019-06-18 Commvault Systems, Inc. Techniques for serving archived electronic mail
US20150212889A1 (en) 2014-01-27 2015-07-30 Commvault Systems, Inc. Techniques for serving archived electronic mail
US9633056B2 (en) 2014-03-17 2017-04-25 Commvault Systems, Inc. Maintaining a deduplication database
US9276871B1 (en) 2014-03-20 2016-03-01 Cisco Technology, Inc. LISP stretched subnet mode for data center migrations
US10169162B2 (en) 2014-06-11 2019-01-01 Commvault Systems, Inc. Conveying value of implementing an integrated data management and protection system
US20150363270A1 (en) 2014-06-11 2015-12-17 Commvault Systems, Inc. Conveying value of implementing an integrated data management and protection system
US20160019224A1 (en) 2014-07-18 2016-01-21 Commvault Systems, Inc. File system content archiving based on third-party application archiving rules and metadata
US9575673B2 (en) 2014-10-29 2017-02-21 Commvault Systems, Inc. Accessing a file system using tiered deduplication
US9848046B2 (en) 2014-11-13 2017-12-19 Commvault Systems, Inc. Archiving applications in information management systems
US20160179435A1 (en) 2014-12-19 2016-06-23 Oracle International Corporation Systems and methods for shadow migration progress estimation
US9781000B1 (en) 2014-12-22 2017-10-03 EMC IP Holding Company LLC Storage mobility using locator-identifier separation protocol
US20160210064A1 (en) 2015-01-21 2016-07-21 Commvault Systems, Inc. Database protection using block-level mapping
US10223212B2 (en) 2015-01-21 2019-03-05 Commvault Systems, Inc. Restoring archived object-level database data
US20160253254A1 (en) 2015-02-27 2016-09-01 Commvault Systems, Inc. Diagnosing errors in data storage and archiving in a cloud or networking environment
US9928144B2 (en) 2015-03-30 2018-03-27 Commvault Systems, Inc. Storage management of data using an open-archive architecture, including streamlined access to primary data originally stored on network-attached storage and archived to secondary storage
US10339106B2 (en) 2015-04-09 2019-07-02 Commvault Systems, Inc. Highly reusable deduplication database after disaster recovery
US20160299818A1 (en) 2015-04-09 2016-10-13 Commvault Systems, Inc. Highly reusable deduplication database after disaster recovery
US9213540B1 (en) 2015-05-05 2015-12-15 Archive Solutions Providers Automated workflow management system for application and data retirement
US10089337B2 (en) 2015-05-20 2018-10-02 Commvault Systems, Inc. Predicting scale of data migration between production and archive storage systems, such as for enterprise customers having large and/or numerous files
US20190042609A1 (en) 2015-05-20 2019-02-07 Commvault Systems, Inc. Predicting scale of data migration
US20160342661A1 (en) 2015-05-20 2016-11-24 Commvault Systems, Inc. Handling user queries against production and archive storage systems, such as for enterprise customers having large and/or numerous files
US20160342633A1 (en) 2015-05-20 2016-11-24 Commvault Systems, Inc. Predicting scale of data migration between production and archive storage systems, such as for enterprise customers having large and/or numerous files
US10481824B2 (en) 2015-05-26 2019-11-19 Commvault Systems, Inc. Replication using deduplicated secondary copy data
US10310953B2 (en) 2015-12-30 2019-06-04 Commvault Systems, Inc. System for redirecting requests after a secondary storage computing device failure
US20180137139A1 (en) 2016-11-16 2018-05-17 Commvault Systems, Inc. Dynamically configuring a proxy server using containerization for concurrent and/or overlapping backup, restore, and/or test operations
US20180288150A1 (en) 2017-03-28 2018-10-04 Commvault Systems, Inc. Archiving mail servers via a simple mail transfer protocol (smtp) server
US20190108341A1 (en) 2017-09-14 2019-04-11 Commvault Systems, Inc. Ransomware detection and data pruning management
US10742735B2 (en) 2017-12-12 2020-08-11 Commvault Systems, Inc. Enhanced network attached storage (NAS) services interfacing to cloud storage

Non-Patent Citations (50)

* Cited by examiner, † Cited by third party
Title
Anonymous, "NTFS Sparse Files (NTFS5 Only)", Jun. 4, 2002, pp. 1-1, https://web.archive.org/web/20020604013016/https://ntfs.com/ntfs-sparse.htm.
Armstead et al., "Implementation of a Campus-wide Distributed Mass Storage Service: The Dream vs. Reality," IEEE, Sep. 11-14, 1995, pp. 190-199.
Arneson, "Mass Storage Archiving in Network Environments," Digest of Papers, Ninth IEEE Symposium on Mass Storage Systems, Oct. 31, 1988-Nov. 3, 1988, pp. 45-50, Monterey, CA.
Cabrera et al., "ADSM: A Multi-Platform, Scalable, Backup and Archive Mass Storage System," Digest of Papers, Compcon '95, Proceedings of the 40th IEEE Computer Society International Conference, Mar. 5, 1995-Mar. 9, 1995, pp. 420-427, San Francisco, CA.
Commvault Systems, Inc., "Continuous Data Replicator 7.0," Product Data Sheet, 2007, 6 pages.
CommVault Systems, Inc., "Deduplication," <https://documentation.commvault.com/commvault/release_8_0_0/books_online_1/english_US/features/single_instance/single_instance.htm>, earliest known publication date: Jan. 26, 2009, 9 pages.
CommVault Systems, Inc., "Deduplication—How To," <https://documentation.commvault.com/commvault/release_8_0_0/books_online_1/english_US/features/single_instance/single_instance_how_to.htm>, earliest known publication date: Jan. 26, 2009, 7 pages.
Computer Hope, "File," May 21, 2008, pp. 1-3, https://web.archive.org/web/20080513021935/https://www.computerhope.com/jargon/f/file.htm.
Diligent Technologies, "HyperFactor," <https://www.diligent.com/products:protecTIER-1:HyperFactor-1>, Internet accessed on Dec. 5, 2008, 2 pages.
Eitel, "Backup and Storage Management in Distributed Heterogeneous Environments," IEEE, Jun. 12-16, 1994, pp. 124-126.
Enterprise Storage Management, "What Is Hierarchical Storage Management?", Jun. 19, 2005, pp. 1, https://web.archive.org/web/20050619000521/https://www.enterprisestoragemanagement.com/faq/hierarchical-storage-management-shtml.
Enterprise Storage Management, "What Is A Incremental Backup?", Oct. 26, 2005, pp. 1-2, https://web.archive.org/web/20051026010908/https://www.enterprisestoragemanagement.com/faq/incremental-backup.shtml.
Examination Report dated Dec. 14, 2018 in European Patent Application No. 09816825.5, 7 pages.
Extended European Search Report for 09816825.5; dated Oct. 27, 2015, 15 pages.
Extended European Search Report for EP07865192.4; dated May 2, 2013, 7 pages.
Federal Information Processing Standards Publication 180-2, "Secure Hash Standard", Aug. 1, 2002, <https://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf>, 83 pages.
FlexHex, "NTFS Sparse Files for Programmers", Feb. 22, 2006, pp. 1-4, https://web.archive.org/web/20060222050807/https://www.flexhex.com/docs/articles/sparse-files.phtml.
Gait, J., "The Optical File Cabinet: A Random-Access File System For Write-Once Optical Disks," IEEE Computer, vol. 21, No. 6, pp. 11-22 (Jun. 1988).
Geer, D., "Reducing The Storage Burden Via Data Deduplication," IEEE, Computer Journal, vol. 41, Issue 12, Dec. 2008, pp. 15-17.
Handy, Jim, "The Cache Memory Book: The Authoritative Reference on Cache Design," Second Edition, 1998, pp. 64-67 and pp. 204-205.
International Preliminary Report on Patentability and Written Opinion for PCT/US2007/086421, dated Jun. 18, 2009, 8 pages.
International Preliminary Report on Patentability and Written Opinion for PCT/US2011/054378, dated Apr. 11, 2013, 5 pages.
International Search Report and Written Opinion for PCT/US07/86421, dated Apr. 18, 2008, 9 pages.
International Search Report for Application No. PCT/US09/58137, dated Dec. 23, 2009, 14 pages.
International Search Report for Application No. PCT/US10/34676, dated Nov. 29, 2010, 9 pages.
International Search Report for Application No. PCT/US11/54378, dated May 2, 2012, 8 pages.
Jander, M., "Launching Storage-Area Net," Data Communications, US, McGraw Hill, NY, vol. 27, No. 4 (Mar. 21, 1998), pp. 64-72.
Kornblum, Jesse, "Identifying Almost Identical Files Using Context Triggered Piecewise Hashing," www.sciencedirect.com, Digital Investigation 3S (2006), pp. S91-S97.
Kulkarni P. et al., "Redundancy elimination within large collections of files," Proceedings of the Usenix Annual Technical Conference, Jul. 2, 2004, pp. 59-72.
Lortu Software Development, "Kondar Technology—Deduplication," <https://www.lortu.com/en/deduplication.asp>, Internet accessed on Dec. 5, 2008, 3 pages.
Menezes et al., "Handbook Of Applied Cryptography", CRC Press, 1996, <https://www.cacr.math.uwaterloo.ca/hac/about/chap9.pdf>, 64 pages.
Microsoft, "Computer Dictionary," Fifth Edition, 2002, p. 220.
Microsoft, "Computer Dictionary", p. 249, Fifth Edition, 2002, 3 pages.
Microsoft, "Computer Dictionary", pp. 142, 150, 192, and 538, Fifth Edition, 2002, 6 pages.
Overland Storage, "Data Deduplication," <https://www.overlandstorage.com/topics/data_deduplication.html>, Internet accessed on Dec. 5, 2008, 2 pages.
Partial Supplementary European Search Report in Application No. 09816825.5, dated Apr. 15, 2015, 6 pages.
Quantum Corporation, "Data De-Duplication Background: A Technical White Paper," May 2008, 13 pages.
Rosenblum et al., "The Design and Implementation of a Log-Structured File System," Operating Systems Review SIGOPS, vol. 25, No. 5, New York, US, pp. 1-15 (May 1991).
SearchStorage, "File System", Nov. 1998, <https://searchstorage.techtarget.com/definition/file-system>, 10 pages.
Sharif, A., "Cache Memory," Sep. 2005, https://searchstorage.techtarget.com/definition/cache-memory, pp. 1-26.
Techterms.com, "File," May 17, 2008, 1 page, <https://web.archive.org/web/20080517102848/https://techterms.com/definition/file>.
U.S. Appl. No. 16/380,469, filed Apr. 10, 2019, Vijayan et al.
U.S. Appl. No. 16/407,040, filed May 8, 2019, Ngo.
Webopedia, "Cache," Apr. 11, 2001, https://web.archive.org/web/20010411033304/https://www.webopedia.com/TERM/c/cache.html, pp. 1-4.
Webopedia, "Data Duplication", Aug. 31, 2006, <https://web.archive.org/web/20060913030559/https://www.webopedia.com/TERMID/data_deduplication.html>, 2 pages.
Webopedia, "File," May 21, 2008, pp. 1-3, <https://web.archive.org/web/20080521094529/https://www.webopedia.com/TERM/F/file.html>.
Webopedia, "Folder", Aug. 9, 2002, <https://web.archive.org/web/20020809211001/https://www.webopedia.com/TERM/F/folder.html>, pp. 1-2.
Webopedia, "Logical Drive", Aug. 13, 2004, pp. 1-2, https://web.archive.org/web/20040813033834/https://www.webopedia.com/TERM/L/logical_drive.html.
Webopedia, "LPAR", Aug. 8, 2002, pp. 1-2, https://web.archive.org/web/20020808140639/https://www.webopedia.com/TERM/L/LPAR.html.
Webopedia, "Metadata", Apr. 5, 2001, <https://web.archive.org/web/20010405235507/https://www.webopedia.com/TERM/M/metadata.html>, pp. 1-2.

Also Published As

Publication number Publication date
US10762036B2 (en) 2020-09-01
US20150134924A1 (en) 2015-05-14
US9262275B2 (en) 2016-02-16
US20220309032A1 (en) 2022-09-29
US20200349107A1 (en) 2020-11-05
US20120084524A1 (en) 2012-04-05
US20160224598A1 (en) 2016-08-04
US9639563B2 (en) 2017-05-02
WO2012045023A3 (en) 2012-06-21
US11768800B2 (en) 2023-09-26
WO2012045023A2 (en) 2012-04-05
US20170206206A1 (en) 2017-07-20
US8935492B2 (en) 2015-01-13

Similar Documents

Publication Publication Date Title
US11768800B2 (en) Archiving data objects using secondary copies
US11640338B2 (en) Data recovery operations, such as recovery from modified network data management protocol data
US11003626B2 (en) Creating secondary copies of data based on searches for content
US20190324860A1 (en) Systems and methods for analyzing snapshots
AU2010339584B2 (en) Systems and methods for performing data management operations using snapshots
US9047296B2 (en) Asynchronous methods of data classification using change journals and other data structures
US8335776B2 (en) Distributed indexing system for data storage
US20180144000A1 (en) Storing a variable number of instances of data objects
CA2729078C (en) Systems and methods for managing single instancing data

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: COMMVAULT SYSTEMS, INC., NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GOKHALE, PARAG;KOTTOMTHARAYIL, RAJIV;VARADHARAJAN, PRAKASH;REEL/FRAME:053360/0131

Effective date: 20111005

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

AS Assignment

Owner name: JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT, ILLINOIS

Free format text: SECURITY INTEREST;ASSIGNOR:COMMVAULT SYSTEMS, INC.;REEL/FRAME:058496/0836

Effective date: 20211213

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE