1space Multi-Cloud

1space Overview

1space creates a single object namespace between object API endpoints. This enables applications to access data transparently between public cloud and on-premises resources.

1space enables you to manage your data whether it is in your SwiftStack cluster on-premises or in the public cloud. With flexible data management policies, 1space allows you to position your data where you can best take advantage of it.

For example, 1space can provide a lifecycle profile which automatically moves older data to the cloud to make room for new data on-premises. Or, data can be mirrored to a public cloud to take advantage of its compute services. 1space can also be used for live migration of data from a public cloud, or another private cloud storage location to a SwiftStack cluster.

Because of the single object namespace, data is accessible regardless of where it is stored. The 1space Cloud Connector can be deployed in the public cloud to provide access to the single object namespace.

1space can be configured to support many needs and workflows. Please read through the feature documentation below and review the example use cases. For any questions or configuration assistance, please contact SwiftStack Support.

Deployment Options

1space is available with the following deployment configurations.

  1. Lifecycle – Move data to cloud storage
  2. Sync / Mirror – Copy data to cloud storage
  3. Cloud Connector – Access data from cloud compute infrastructure
  4. Migrations – Copy data from cloud storage

Lifecycle and Sync / Mirror

1space Lifecycle and Sync / Mirror are software components that run on an on-premises SwiftStack cluster. From the cluster's point of view, they push data to a remote storage location.

Cloud Connector

1space Cloud Connector is containerized software that runs in cloud compute infrastructure and provides read/write access to an on-premises SwiftStack cluster. The data feels local to the cloud compute infrastructure, while the connector transparently accesses data in the SwiftStack cluster. New data is stored in the "local" object storage.

This enables data access portability without needing to move data between on-premises and public cloud unless needed.

Migrations

Migrations also operate on an on-premises SwiftStack cluster. Migrations pull data from a remote storage location.

Configuration Options

1space offers the following configuration options.

| Configuration | Lifecycle | Sync/Mirror | Cloud Connector | Migrations | Description |
|---|---|---|---|---|---|
| Single Namespace | Yes | Yes | Yes | Yes | Merge a SwiftStack Cluster bucket/container with a cloud bucket/container. |
| Restore on GET | Yes | No | No | Yes | Store a local copy of data upon access, if accessing from a "remote" location. For migrations, a GET request would "jump the queue" of the planned migration. For Lifecycle, the configured lifecycle policy would restart. |
| Propagate DELETE option | Yes | Yes | No | Yes | Forward delete requests to the remote storage location. |
| Do not propagate DELETE option | Yes | Yes | No | No | Do not forward delete requests to the remote storage location. |
| Lifecycle Trigger: Immediate | Yes | No | No | No | Move data after specified time interval (in days). |
| SwiftStack Namespace Prefix | Yes | Yes | Yes | No | Use the recommended naming prefix in the remote storage location to reduce the odds of a namespace conflict in the remote destination. |
| Custom Namespace Prefix | Yes | Yes | Yes | Yes | Enable migration of preexisting data, or lifecycle data to a preexisting storage location. |
| Metadata-based Triggers | Yes | Yes | No | No | Trigger Lifecycle or Sync/Mirror policies based on object metadata. |

Cloud Storage Targets

1space works with Amazon S3, Google Cloud Storage, and SwiftStack storage clusters. It may also work with other S3- or Swift-compatible systems. See the table below, and please contact support if you have any questions regarding your use case.

1space supports the following targets.

| Cloud Storage | Lifecycle | Sync/Mirror | Migrations |
|---|---|---|---|
| AWS S3 | Yes | Yes | Yes |
| Google Cloud Storage | Yes | Yes | Yes |
| SwiftStack Cluster | Yes | Yes | Yes |
| Other qualified AWS S3 API endpoints | No | No | Yes |
| Other qualified OpenStack Swift API endpoints | No | No | Yes |

Configuring 1space

Multiple Swift containers can be mapped to a single cloud bucket. Each 1space mapping consists of the Swift account (e.g. AUTH_test), container, the cloud container (bucket), and a set of cloud credentials.

The processes doing the replication reside on the Swift container nodes. These nodes need network access to the cloud API endpoint in order to replicate the data.

Warning

Beware of using the same bucket for multiple SwiftStack clusters. In that case, if the same account and container exist on two clusters, one cluster may overwrite objects from the other in the bucket.
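The overwrite risk described in the warning can be checked ahead of time. The following is a minimal sketch, assuming you keep your own inventory of mappings per cluster; the Mapping record and its field names are hypothetical and are not part of any 1space API.

```python
from collections import defaultdict
from typing import NamedTuple


class Mapping(NamedTuple):
    """Hypothetical record for one 1space mapping (field names are illustrative)."""
    cluster: str
    account: str
    container: str
    bucket: str


def find_bucket_conflicts(mappings):
    """Return {(bucket, account, container): clusters} shared by more than one cluster."""
    seen = defaultdict(set)
    for m in mappings:
        seen[(m.bucket, m.account, m.container)].add(m.cluster)
    return {key: sorted(clusters) for key, clusters in seen.items() if len(clusters) > 1}


# Illustrative inventory: two clusters map the same account/container to one bucket.
mappings = [
    Mapping("cluster-a", "AUTH_test", "photos", "shared-bucket"),
    Mapping("cluster-b", "AUTH_test", "photos", "shared-bucket"),
    Mapping("cluster-b", "AUTH_test", "logs", "shared-bucket"),
]
conflicts = find_bucket_conflicts(mappings)
print(conflicts)
```

Any entry in the result identifies an account and container pair that two or more clusters would write under the same bucket.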

All of the 1space administration is performed through the 1space tab, on the left side in the Cluster management view. The subsequent sections assume that you have already navigated to that section.

1space Cloud Credentials

Before adding a mapping, a set of credentials must be created. To set up credentials, navigate to the 1space tab in the manage cluster interface. Follow these steps to configure specific providers:

  1. Click on Credentials
../_images/cloud_sync.png
  2. Add credentials by specifying a friendly Label, Access Key ID, and Secret Access Key
  3. Select Amazon S3 or Google Cloud Storage, depending on which provider you'd like to use.
../_images/cloud_sync_creds.png

Afterward, you can edit the credentials (for example, if the secret keys are rotated). Each set of credentials for a given cluster must be unique, i.e. you cannot add multiple credentials with the same Label.

After editing credentials, you must deploy the configuration for the changes to go into effect.

You can validate your credentials before submitting them. The validation takes place on a node in the cluster. You will not be able to validate your credentials if there are no nodes in the cluster. This validation step only attempts to list buckets/containers to ensure that the credentials are recognized by the service provider.

Note

Google Cloud Storage Interoperability mode must be enabled. See Configuring S3 Interoperability in Google Cloud Storage.

Note

When editing credentials, the secret access key will not be displayed. Leaving the field blank will keep the current secret. Supplying a new secret will overwrite the existing value.

Other providers

1space can be used with other providers that implement the S3 or Swift API. To do so:

  1. Select "Other" from the provider dropdown
  2. Enter the provider endpoint in the "Endpoint" field, for example, https://storage.provider.com. If you are planning to use 1space to sync to another Swift cluster, the URL must be the auth URL, such as https://swift.com/auth/v1.0.
  3. If you wish to sync to a Swift cluster, select Swift from the protocol dropdown. By default, S3 will be used to connect to the remote object store.
../_images/cloud_sync_creds_other.png

Setting up 1space Profiles

All 1space profiles are listed on the Profiles tab.

../_images/cloud_sync_profiles.png

Currently, 1space allows for two kinds of profiles for data under its management: Sync and Lifecycle.

../_images/cloud_sync_profiles_add.png

Sync Profile

The Sync profile (named sync) copies data into the remote container (S3/Google Cloud Storage bucket or Swift container) as soon as possible. Any objects removed from the source Swift container will also be removed from the remote store (if present).

Lifecycle Profile

The Lifecycle profile copies objects to the remote store after a specified delay. Once an object has been copied, it is removed from the Swift container. A DELETE request issued against the Swift container will not remove objects in the remote store once they have been copied. In addition, the Lifecycle profile offers an option to retrieve objects on access. Since the objects are still visible through the Swift API, 1space can place them back into the cluster when a GET request is issued (this is done in-line with serving the request). After an object is restored, it is subject to the same expiration policy (i.e. if the Lifecycle policy expires objects after 1 week, the object will expire 1 week after restoration).
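The timing rule above can be expressed as simple date arithmetic. This is a minimal sketch, not the 1space implementation: it assumes the delay is counted from object creation, or from the most recent restore on GET if there has been one. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def next_move_time(created, delay, last_restored=None):
    """Time at which an object becomes eligible to move to the remote store.

    Assumption: the delay counts from creation, or from the most recent
    restore-on-GET if the object has been restored since.
    """
    start = created if last_restored is None else max(created, last_restored)
    return start + delay


one_week = timedelta(days=7)
created = datetime(2024, 1, 1)
restored = datetime(2024, 1, 20)
print(next_move_time(created, one_week))            # one week after creation
print(next_move_time(created, one_week, restored))  # one week after the restore
```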

Do note that most aspects of a profile cannot be changed once set. The only modifications allowed are adjusting the expiration time and whether to restore objects on a GET request for the Lifecycle profile. If the expiration time is increased, objects that have already been copied will not be brought back into the cluster. However, new objects' expiration will be governed by the updated delay time.

Metadata Conditions

1space also supports the ability to set additional metadata conditions to trigger data movement events. After saving the profile, edit the profile to add the metadata keys to be considered. Then configure the conditions that must be true for the lifecycle or sync event to trigger.

../_images/1space-metadata-conditions.png

This functionality enables a subset of the data in the namespace to be moved to a public cloud or remote location.

Examples of usage include:

  1. Data lifecycle of a data subset for data processing in the cloud
  2. Triggering lifecycle when a project is done
  3. Mirroring a subset of active data

1space Mappings

Once at least one set of credentials exists, you can create 1space mappings for existing Swift Accounts and Containers. To do so, from the Mappings tab follow these steps:

  1. Click New container mapping

  2. Enter a Swift Account, Container, the name of the Remote Bucket, and select a Credential and 1space Profile to use.

  3. If cloud-connector should not expose the mapping, uncheck "Cloud connector".

  4. If you don't want to use the recommended default prefix for where shared-namespace data is actually stored inside the "Remote Bucket", uncheck "Use Default Prefix" and enter a custom prefix. If you want the data to be stored in a "faux directory", the custom prefix should end with a forward slash (/).

  5. Click "Verify" to ensure the selected credential set has sufficient permissions and that the Swift cluster nodes can successfully talk to the destination cloud.

  6. Finally, click "Save Container Mapping".

    ../_images/cloud_sync_add_mapping.png

1space allows the administrator to enable sync for all containers in the account to a particular bucket. Each container will be placed under its own hashed prefix in the remote bucket. Please see "Swift object representation in S3" for more details on that.

Before creating the mapping, you can verify that 1space will be able to perform all of its required actions. Clicking the "Verify" button runs a short set of checks against the service provider from one of the nodes in the cluster. The checks include calls to PUT, HEAD, POST (or PUT as a server-side-copy for S3), and DELETE within the specified bucket/container. Please see below for an example IAM policy in Amazon S3 required for these tests to pass.

Afterward, each mapping will appear in the table on the 1space page.

../_images/cloud_sync_mappings.png

Changing the 1space profile associated with the mapping does not revert prior operations. For example, when changing from a Lifecycle profile to a Sync profile, the objects that have been archived and expired will not be copied back into the Swift container from the remote store. Similarly, changing from Sync to Lifecycle will not expire objects that have already been copied to the remote store. You can force 1space to reprocess all of the data in the Swift container using the reset button (see "Resetting a mapping" for more details).

Resetting a mapping

It is possible that objects are removed from the remote bucket at some point, through an accident or deliberate action. 1space can repopulate all of the missing objects and ensure that all of the data is replicated. You can use the reset button (↻) next to the affected mapping to trigger this action.

The length of the actual process to re-populate the remote data depends on how many objects have been removed. 1space will continue to copy objects in the background until it has iterated through all of the objects in the Swift container and ensured they are replicated.

If a mapping is configured to sync all containers, resetting the mapping will reset the state for each one of them.

1space Cloud Connector

The 1space Cloud Connector allows S3-protocol access to single-namespace 1space mappings from public cloud compute infrastructure.

For more details, see 1space Cloud Connector.

1space Data Location

You can find out where objects are located by issuing a GET request against the archived container and specifying the JSON format. 1space will return a content_location entry for each key, which specifies whether an object is in the remote store, in local Swift storage, or both (which can happen if the object is restored on a GET).
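As a rough illustration, the sketch below builds a JSON listing URL and tallies objects by their content_location value. The storage URL, container name, and location labels in the sample are placeholders, and the exact shape of content_location is an assumption, so the code treats it as an opaque label.

```python
import json
from collections import Counter
from urllib.parse import quote


def listing_url(storage_url, container):
    """Build the container listing URL that asks Swift for JSON output."""
    return f"{storage_url.rstrip('/')}/{quote(container)}?format=json"


def summarize_locations(listing_json):
    """Count objects per content_location value in a JSON container listing."""
    counts = Counter()
    for entry in json.loads(listing_json):
        location = entry.get("content_location", "unknown")
        # The shape of content_location is an assumption: a list is flattened
        # into one label, and anything else is converted to a string.
        if isinstance(location, list):
            location = ",".join(sorted(location))
        else:
            location = str(location)
        counts[location] += 1
    return dict(counts)


# Hypothetical listing; the location labels are placeholders, not real 1space values.
sample = json.dumps([
    {"name": "a.txt", "bytes": 10, "content_location": ["local"]},
    {"name": "b.txt", "bytes": 20, "content_location": ["remote"]},
    {"name": "c.txt", "bytes": 30, "content_location": ["local", "remote"]},
])
print(listing_url("https://swift.example.com/v1/AUTH_test", "archive"))
print(summarize_locations(sample))
```

In practice the listing body would come from an authenticated GET against the URL returned by listing_url.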

Object Prefix in Remote Location

Custom Object Prefix

If desired, a custom object prefix may be used to select a specific location in the remote bucket. This is useful for a 1:1 mapping of a SwiftStack bucket/container to a remote S3 bucket.

The prefix can also point the mapping at a specific path within the remote bucket.

Note

While a custom object prefix adds configuration flexibility, take care to ensure that the specified custom prefix does not overlap with other locations.

Automatic Hashed Object Prefix

To allow objects from multiple Swift containers to appear in an S3 bucket, the S3 keys include the account and container. To prevent all keys from being stored with the same prefix for a given account, 1space by default prepends a hashed prefix to each key. The prefixes for each mapping are listed in the 1space configuration table and are deterministically derived from the account and container. For example, an object named object in a Swift container named container under the account AUTH_account will be stored in the S3 bucket as 62506b/AUTH_account/container/object.

The prefix for each mapping is listed on the 1space page to make it easier to locate data. 1space allows many Swift accounts and buckets/containers to be replicated to S3 under one account.

If an application needs to be able to determine the prefix programmatically, you can implement the following pseudocode:

hash = md5("<account>/<container>")
prefix = long(hash) % 16^6

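In Python 3, this would look as follows:

import hashlib
# md5() requires bytes, so the "<account>/<container>" string is a bytes literal.
h = hashlib.md5(b'AUTH_test/test').hexdigest()
# Keep the value modulo 16**6 (at most six hex digits) and drop the leading "0x".
prefix = hex(int(h, 16) % 16**6)[2:]

The stripping of the leading two characters removes the "0x" from the hex string. (In Python 2, where the value is a long, the trailing "L" must be stripped as well, using [2:-1].) In the example above, the prefix would be 2e122e.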
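Putting the prefix together with the key layout described earlier, a minimal Python 3 sketch of the full remote key might look like this. The function names are hypothetical, and the layout follows the 62506b/AUTH_account/container/object example rather than any published API.

```python
import hashlib


def hashed_prefix(account, container):
    """Prefix derived from the MD5 of "<account>/<container>", modulo 16**6."""
    digest = hashlib.md5(f"{account}/{container}".encode("utf-8")).hexdigest()
    # format(..., "x") yields lowercase hex without "0x"; leading zeros are not padded.
    return format(int(digest, 16) % 16**6, "x")


def remote_key(account, container, obj):
    """Key layout <prefix>/<account>/<container>/<object>, as in the example above."""
    return f"{hashed_prefix(account, container)}/{account}/{container}/{obj}"


print(remote_key("AUTH_account", "container", "object"))
```

Because leading zeros are not padded, a prefix can occasionally be shorter than six characters.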

Static Large Objects (SLO)

1space propagates Static Large Object (SLO) manifests to the public cloud using the S3 Multipart Upload (MPU) interface. As Amazon S3 restricts the size of each MPU part (except for the last part) to be between 5MB and 5GB, 1space will reject any SLO manifests that have parts outside that range. For example, an SLO that is composed of two objects, where the size of the first object is 1MB and the second object is 100MB, would be deemed invalid. In such cases, 1space will report an error in the log file, but continue to process the other objects in the container.

When uploading data to Google Cloud Storage, as the provider does not implement the MPU API, 1space converts the SLO manifest into a single stream. This results in a single file upload and is valid for objects up to 5TB, the largest allowable single object size in Google Cloud Storage.

Limitations

1space does not currently support SLO manifests that have more than 10000 parts. It also does not support SLO manifests composed of other SLO objects. In this case, you should expect the upload to fail or be malformed, depending on whether the nested SLO is at the end of the original manifest.

Lastly, 1space does not support the range option in the SLO manifests. When 1space encounters an SLO with any segment entry that contains a range, an error will be recorded and 1space will move on to other objects.
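The SLO restrictions in this section can be checked before uploading. The sketch below is an illustration and not 1space's own validation. It assumes manifest entries shaped like Swift's stored SLO manifest (keys bytes, sub_slo, and range), and it reads 5MB and 5GB as binary units.

```python
MIN_PART = 5 * 1024**2   # 5MB lower bound, read here as MiB (assumption)
MAX_PART = 5 * 1024**3   # 5GB upper bound, read here as GiB (assumption)
MAX_PARTS = 10000


def slo_problems(segments):
    """Return a list of reasons this SLO manifest would not be propagated."""
    problems = []
    if len(segments) > MAX_PARTS:
        problems.append(f"{len(segments)} parts exceeds {MAX_PARTS}")
    for index, seg in enumerate(segments):
        is_last = index == len(segments) - 1
        size = seg["bytes"]
        # Only the last part may be smaller than the minimum part size.
        if size > MAX_PART or (size < MIN_PART and not is_last):
            problems.append(f"segment {index}: size {size} outside MPU limits")
        if seg.get("sub_slo"):
            problems.append(f"segment {index}: nested SLO")
        if seg.get("range"):
            problems.append(f"segment {index}: range not supported")
    return problems


# The example from the text: a 1MB part followed by a 100MB part.
example = [{"bytes": 1 * 1024**2}, {"bytes": 100 * 1024**2}]
print(slo_problems(example))
```

Reversing the two segments would pass the size check, since only the last part may fall below the minimum.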

Dynamic Large Objects (DLO)

If you use Dynamic Large Objects, the segments may be stored in a container that is not the one being synced. To ensure that the contents of the DLO are also replicated, configure a mapping for the segments container.

Large objects are not converted into a multi-part upload (or a single object upload in Google Cloud Storage), meaning that each segment will be a separate object in the cloud.
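To work out which segments container needs its own mapping, you can inspect the DLO manifest object's X-Object-Manifest header, which Swift sets to <container>/<prefix>. The helper below is a minimal sketch with a hypothetical name and sample values.

```python
from urllib.parse import unquote


def dlo_segments_location(headers):
    """Return (container, prefix) from X-Object-Manifest, or None if not a DLO."""
    for name, value in headers.items():
        if name.lower() == "x-object-manifest":
            # The header value is URL-encoded "<container>/<prefix>".
            container, _, prefix = unquote(value).partition("/")
            return container, prefix
    return None


# Illustrative headers from a HEAD request on a DLO manifest object.
headers = {"X-Object-Manifest": "videos_segments/movie.mp4/"}
print(dlo_segments_location(headers))
```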

Configuring Amazon S3 IAM Policy

As an AWS S3 best practice, Amazon recommends granting only a restricted set of actions. For 1space, the policy can be restricted to the set of buckets on which it operates.

To create an IAM policy, perform the following steps in the AWS Management Console:

  1. Create an IAM policy as explained below
  2. Create or select an IAM group and attach the IAM policy to that group
  3. Assign a user (creating a user if required) into that group
  4. Capture the access key information to use as the access key / secret key when configuring 1space credentials.

Creating an IAM policy

The following requests must be allowed on those resources:

  • PutObject
  • DeleteObject
  • GetObject
  • ListBucket

Here is an example of an IAM policy for 1space on a specific bucket (example-1space-bucket):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::example-1space-bucket/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::example-1space-bucket"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets"
            ],
            "Resource": [
                "arn:aws:s3:::*"
            ]
        }
    ]
}

  • PutObject is required to write objects into S3.
  • DeleteObject is required to delete the objects that have been removed from the Swift container.
  • GetObject is required because 1space issues a HEAD request to check whether an object has already been copied.
  • ListBucket is required for S3 to return 404 (Not Found), as opposed to 403 (Forbidden) if the object is not found (when issuing HEAD).

Lastly, ListAllMyBuckets is required for 1space to validate that credentials are valid when creating a new set.
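If you manage several buckets, the same policy can be generated per bucket. The sketch below reproduces the example policy for an arbitrary bucket name; the function name is hypothetical, and the output is meant to be pasted into the IAM policy editor.

```python
import json


def onespace_iam_policy(bucket):
    """Build the example 1space policy for the given bucket name."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Object-level actions apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:DeleteObject", "s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
            {   # Bucket-level actions apply to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {   # Needed when validating a new set of credentials.
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets"],
                "Resource": ["arn:aws:s3:::*"],
            },
        ],
    }


policy = onespace_iam_policy("example-1space-bucket")
print(json.dumps(policy, indent=4))
```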

Configuring S3 Interoperability in Google Cloud Storage

If you haven't already, enable S3 Interoperability by navigating to the Cloud Storage Settings page.

../_images/google-storage-cloud-settings-navigation.png
  1. Select Interoperability
  2. Click Make this my default project
  3. Click Create a new key
  4. Capture the access key information to use as the access key / secret key when configuring 1space credentials.

For more information see Google's migration guide.

../_images/google-storage-cloud-settings-create-key.png

Swift Data Encryption

If Swift encryption is enabled, 1space will decrypt the object contents before storing the content in the remote store.