Please see SSH - Managing Keys and Connections.
We do not have enough external IPs to assign one to every VM. For development, we typically recommend using either SSH port forwarding or tsocks to access the VMs directly. If you need an external IP for a production purpose, let us know and we’ll try to accommodate the request.
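As a rough illustration of the port forwarding approach, the sketch below builds an `ssh` command that tunnels a local port to the SSH port on a VM. It assumes you reach the VM through a login node you can already SSH into. The username, login node name, VM address, and the helper name `ssh_forward_command` are all hypothetical placeholders, not OSDC values.

```python
import shlex


def ssh_forward_command(user, gateway, vm_ip, local_port=2222, vm_port=22):
    """Build an ssh command that forwards a local port to a port on a VM."""
    # -N: do not run a remote command, only forward ports.
    # -L local_port:vm_ip:vm_port: connections to localhost:local_port are
    # tunneled through the gateway to vm_ip:vm_port.
    return [
        "ssh", "-N",
        "-L", f"{local_port}:{vm_ip}:{vm_port}",
        f"{user}@{gateway}",
    ]


if __name__ == "__main__":
    # Hypothetical values; substitute your own username, login node, and VM address.
    cmd = ssh_forward_command("alice", "gateway.example.org", "10.0.0.15")
    print(shlex.join(cmd))
```

With the tunnel running in one terminal, a second terminal could then connect to the VM through the forwarded port, for example with `ssh -p 2222 <username>@localhost`.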
GlusterFS is a scalable, distributed file system that we use on our clouds to provide file-level access to data. Each cloud has its own GlusterFS store that is visible from all nodes and VMs. Additionally, the GlusterFS store that contains the OSDC public datasets is readable from all locations.
Your home folder can be found at /glusterfs/users/<username>. This folder is mounted and accessible from all of your virtual machines on the cloud you are working on.
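Because the home folder is at the same location on every VM, pipeline scripts can build paths from the username instead of hard-coding them. The following is a minimal sketch of that idea; the helper names `home_folder` and `output_path` and the example file names are hypothetical.

```python
from pathlib import PurePosixPath

# Root of the per-user home folders, as given above.
GLUSTER_USERS_ROOT = PurePosixPath("/glusterfs/users")


def home_folder(username):
    """Return the GlusterFS home folder for a username."""
    # Reject anything that is not a single, ordinary path component.
    if not username or "/" in username or username in (".", ".."):
        raise ValueError("username must be a single, non-empty path component")
    return GLUSTER_USERS_ROOT / username


def output_path(username, *parts):
    """Return a path below the user's home folder."""
    return home_folder(username).joinpath(*parts)


if __name__ == "__main__":
    # Hypothetical username and file layout.
    print(output_path("alice", "results", "run1.csv"))
```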
We are providing a shared community resource, so each cloud has default quotas for storage and number of cores for new users. If you require more resources for a specific project, we can work with you to increase these quotas.
Please contact us and we can set up a folder where you can place your public data for the community to use.
Transferred data should go to your home directory or a shared directory previously configured for a group project.
Depending on your pipeline, the software may need to be installed on all of the nodes, and it will definitely need to be installed on the compute nodes. A good way to do this is to start a VM, install the packages you need using apt or under /usr/local/bin, and then create a snapshot of that VM. Then select that image when launching your cluster for both the headnode and the compute nodes.
You can go to the snapshot section of our instance page to learn more, but in short, snapshots are a way to save and share the packages you’ve installed on instances for later use. We’re currently working on setting up methods for users to add additional metadata so that you and other OSDC users can understand what types of packages are installed and what type of analysis was conducted with that VM.
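Before taking the snapshot, it can help to confirm that everything the pipeline calls is actually on the PATH of the VM. The sketch below is one possible check using Python's standard `shutil.which`; the tool names in the example list are hypothetical and should be replaced with whatever your pipeline uses.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that are not found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Hypothetical tool list; replace with the commands your pipeline calls.
    required = ["samtools", "bwa", "python3"]
    missing = missing_tools(required)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools found.")
```

Running the same check on a compute node launched from the snapshot is a quick way to confirm that the image carries the software you expect.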
The OSDC is a publicly shared resource that supports a wide variety of researchers from a number of different scientific disciplines. When you have instances that are not in use but are not terminated, their cores remain reserved for your idle instances, which prevents other researchers from using those cores. Note: Suspending instances still keeps those cores reserved, and they will continue to be counted in metering. Terminating instances that are not in use is definitely the best practice.
Please email support@opensciencedatacloud.org for the fastest response.
The Bionimbus PDC is a HIPAA-compliant cloud for analyzing and sharing protected data. It is an OpenStack cluster that uses ephemeral storage in VMs, with access to a separate S3-compatible storage system for persistent data storage.
Please review the PDC introduction and consult the Bionimbus PDC FAQ to understand access requirements.
As part of the security certification process, we decided not to allow full root access on the VMs. However, there is sudo access to install packages with apt, and if you require privileged access, we will gladly work with you to provide the access you need.
All the VMs use an http_proxy that filters content based on a whitelist we maintain. If you need access to a specific resource, please contact us and we can easily add it to the whitelist.
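Code running on a VM generally has to send its web requests through that proxy. As a minimal sketch, assuming the proxy address is exposed through the conventional `http_proxy` and `https_proxy` environment variables, the following reads those settings and builds a `urllib` opener that uses them. The helper names and the example proxy URL are hypothetical.

```python
import urllib.request


def proxies_from_env(env):
    """Collect proxy URLs from an environment mapping, keyed by scheme."""
    proxies = {}
    for scheme in ("http", "https"):
        # Check the lowercase name first, then the uppercase variant.
        value = env.get(f"{scheme}_proxy") or env.get(f"{scheme.upper()}_PROXY")
        if value:
            proxies[scheme] = value
    return proxies


def build_opener_for(env):
    """Build a urllib opener that routes requests through the configured proxy."""
    return urllib.request.build_opener(
        urllib.request.ProxyHandler(proxies_from_env(env))
    )


if __name__ == "__main__":
    import os

    # Show which proxy settings the current environment provides.
    print(proxies_from_env(os.environ))
```

If a request made through such an opener fails for a resource you need, that is likely a sign the resource is not yet on the whitelist.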
Contact us at support@opensciencedatacloud.org. This will create a ticket we can track, and a member of our support team will review it and contact you as soon as possible.