Introduction

In your Mantle Database, data is stored as entities. Each entity has a name and a data type, and each data type has a set of required properties and optional properties. You can view all of the entities of a data type as a table: each entity is a row, and its properties are the columns. There are multiple data types, such as rnaseq_fastq or image_directory.

Each value of each property is one of:

  • string
  • integer
  • double (float)
  • boolean
  • file
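
As a rough illustration of how these value types might look on the Python side, the sketch below classifies the values of a properties dictionary. The helper name property_kind and the keys read_count, quality, and paired are hypothetical and not part of the Mantle SDK; the file case assumes the nested file_upload dictionary described in the section on creating entities.

```python
def property_kind(value):
    """Return the property value type a Python value would most plausibly map to."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    # Files are passed as a nested dictionary rather than as a bare value.
    if isinstance(value, dict) and "file_upload" in value:
        return "file"
    raise TypeError(f"unsupported property value: {value!r}")


# Hypothetical property keys and values, for illustration only.
example_properties = {
    "sample": "S1",
    "read_count": 1200,
    "quality": 37.5,
    "paired": True,
    "read1": {"file_upload": {"filename": "reads_1.fastq.gz"}},
}
kinds = {key: property_kind(value) for key, value in example_properties.items()}
print(kinds)
```

This check runs entirely locally and does not contact Mantle, so it can only catch values of an unexpected Python type, not mismatches with a particular data type's definition.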

Using the Mantle SDK, you can access information stored in your Mantle Database or create new entities in your Mantle Database.

Interacting with a single data entity

Getting an entity by its unique ID

Get an entity by its unique ID
import mantlebio

mantle = mantlebio.MantleClient()

entity = mantle.dataset.get("E000001")

This returns a Python object representing the data entity.

Accessing an entity’s name and unique ID

If you got an entity by its unique ID and want to know its name:

Get the name of an entity
import mantlebio

mantle = mantlebio.MantleClient()

entity = mantle.dataset.get("E000001")

print(entity.name)

If you got an entity some other way (e.g. as an output of a Mantle Pipeline run) and want to know its unique ID:

Get the unique ID of an entity
print(entity.unique_id)

Getting entity properties

To conveniently access the values of all of an entity’s properties, you can represent the entity as a Pandas Series:

Get entity properties as Pandas Series
import mantlebio

mantle = mantlebio.MantleClient()

entity = mantle.dataset.get("E000001")

entity_series = entity.to_series()

print(entity_series)

Downloading files from an entity

Files that are properties of entities are stored in Amazon S3. To use the files, you need to download them to your local working environment, whether you are in a Mantle Notebook or using a script on your own computer.

Download files from file properties using the download_s3 method, which takes as arguments the key of the property and the local path to which the file will be downloaded.

Download files from properties
import mantlebio

mantle = mantlebio.MantleClient()

entity = mantle.dataset.get("E000001")

entity.download_s3("image", "local_image.tif")

Querying for a group of entities and returning a DataFrame

To get a Pandas DataFrame in which each entity is a row, query for a group of entities and convert the result into a DataFrame.

Query for entities by data type
import mantlebio

mantle = mantlebio.MantleClient()

entity_list = mantle.dataset.build_query().where(
    "data_type_unique_id=rnaseq_fastq"
).execute()

This returns an iterable of entity Python objects, which can each be interacted with using the SDK. For example, you can loop through and download files from each entity into your working environment.
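
The download loop mentioned above could be sketched as follows. The helper download_property_files and its suffix parameter are hypothetical; the sketch relies only on the unique_id attribute and the download_s3(key, local_path) method shown earlier, and assumes every entity in the iterable has a file stored under the given key.

```python
from pathlib import Path


def download_property_files(entities, key, dest_dir, suffix=""):
    """Download the file stored under `key` from each entity into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for entity in entities:
        # Name each local file after the entity's unique ID to avoid collisions.
        local_path = dest / f"{entity.unique_id}{suffix}"
        entity.download_s3(key, str(local_path))
        downloaded.append(local_path)
    return downloaded
```

With the query result above, a call might look like download_property_files(entity_list, "read1", "fastq", suffix=".fastq.gz"), where the directory name fastq is an arbitrary choice.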

Additionally, you can turn this iterable into a Pandas DataFrame:

Convert iterable of entities to Pandas DataFrame
df = entity_list.to_dataframe()

Creating a new data entity in your Mantle Database

To create a new entity in your Mantle Database, you must provide a name and a data type, and the required properties (each data type has a set of required properties and optional properties). The properties of the entity are expressed as key-value pairs.

You can check what the properties of a data type are using the Mantle Database UI.
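
Once you have looked up the required properties in the UI, a small local check can catch a missing key before you call create. This is only a sketch: missing_required is a hypothetical helper, not an SDK function, and the list of required keys below is a placeholder that you would replace with the keys shown in the UI for your data type.

```python
def missing_required(properties, required_keys):
    """Return the required keys that are absent from a properties dictionary, sorted."""
    return sorted(set(required_keys) - set(properties))


# Hypothetical required keys for illustration; look up the real ones in the UI.
required = ["sample", "read1", "read2"]
properties = {"sample": "S1", "read1": {"file_upload": {"filename": "r1.fastq.gz"}}}
print(missing_required(properties, required))  # prints ['read2']
```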

Example: create a new custom_file entity
import mantlebio

mantle = mantlebio.MantleClient()

entity = mantle.dataset.create(
    name="4XP1 2.5Å crystal structure",
    dataset_type="custom_file",
    properties={
        "file": {"file_upload": {"filename": "</local/path/to/4xp1.pdb>"}}
    },
    local=False,
)

To test this out yourself, download the PDB file here.

Note that to use a file as a property, you need to pass a nested dictionary that instructs Mantle to upload a file from a local file path to Amazon S3:

properties={
    ...
    "<file_property_key>": {"file_upload": {"filename": "<file_path>"}}
    ...
}

This is different from other properties, where you only need to pass the value.
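
If you pass several file properties, a small helper can build the nested dictionary for you. The function file_upload below is a hypothetical convenience, not part of the Mantle SDK; it only produces the dictionary shape shown above, and additionally checks that the local file exists so that a mistyped path fails before anything is sent.

```python
from pathlib import Path


def file_upload(path):
    """Build the nested dictionary that marks a property value as a file to upload."""
    path = Path(path)
    # Fail early, before any request is made, if the local file is missing.
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")
    return {"file_upload": {"filename": str(path)}}
```

In a call to create, this could then be used as properties={"file": file_upload("<file_path>")}.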

Example: create a new rnaseq_fastq entity
import mantlebio

mantle = mantlebio.MantleClient()

entity = mantle.dataset.create(
    name="<my_rnaseq_fastq>",
    dataset_type="rnaseq_fastq",
    properties={
        "sample": "<sample_name>",
        "read1": {"file_upload": {"filename": "<local/path/to/read1.fastq.gz>"}},
        "read2": {"file_upload": {"filename": "<local/path/to/read2.fastq.gz>"}},
        "strandedness": "<strandedness>"
    },
    local=False,
)

The local keyword controls whether the SDK creates the new entity in your Mantle Database via an API request. When local is set to False, the entity is automatically pushed to Mantle. In this version of the SDK, local defaults to True for consistency with earlier versions, but for most purposes setting it to False is appropriate.