File movement in and out of Azure

After a busy couple of weeks I finally got round to writing something new up. It's not strictly database related, but if you're doing any work with Azure, getting files in and out of it is a key thing to understand. In my case I wanted to copy over some tarred-up Oracle software trees from on premise that we keep fully patched up to date, and then put them onto Azure servers using the script I've blogged about in the distant past.

There is more than one way to do this, so I've just picked out how I did it using a couple of different tools. This is not an exhaustive list by any means, it's just how I chose to go about it this time.

In the simplest case you'll have direct access to the servers from your on premise machines - either to a public IP address (which will make your security guys sweat a little) or via 'private peering', which just allows you to access the Azure private IPs over a dedicated connection using the ExpressRoute offering from Microsoft.

In this particular case neither of these methods was available. All access to the servers was via the RemoteApp functionality of Azure (basically Citrix), and in fact Microsoft have pulled this offering for new customers, so only Citrix will be available in the future.

I could probably have found some way to transfer the files through RemoteApp, but I didn't go that route. I wanted to try out some of the command line tools that allow access to the various Azure storage offerings.

Initially I found the storage offerings a little confusing, but they make more sense now I've used them a bit. For my use case there are really just 2 choices: copy the tar files into 'blob' storage or into 'file' storage.

There are 2 main differences I'm concerned with (there are others but for now I'm not interested).

1) Blob storage is 1 cent / GB / month whereas file storage is 8 cents / GB / month
2) File storage can be mounted over SMB (CIFS) or accessed via REST APIs, whereas blob storage can only be accessed via REST APIs

The cost (although cheap in both cases) is substantially different: file storage is 8 times the price. The overhead of managing proper filesystems etc. in the background means more moving parts and more management for Microsoft.

Current pricing is here for those interested
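
To put those two rates side by side, here is a rough sketch of the monthly cost of keeping a 10GB tar file in each type of storage. It assumes the per-GB rates quoted above (which may well be out of date), and the function name monthly_cost_cents is made up purely for illustration.

```shell
#!/bin/sh
# Monthly storage cost in whole cents: size in GB multiplied by the rate
# in cents per GB per month. Rates are the ones quoted in this post and
# are an assumption, not current pricing.
monthly_cost_cents() {
    size_gb=$1
    rate_cents_per_gb=$2
    echo $(( size_gb * rate_cents_per_gb ))
}

blob_cents=$(monthly_cost_cents 10 1)   # blob: 1 cent / GB / month
file_cents=$(monthly_cost_cents 10 8)   # file: 8 cents / GB / month
echo "blob: ${blob_cents} cents/month, file: ${file_cents} cents/month"
```

So for a single 10GB file the difference is 10 cents against 80 cents a month. That is trivial on its own, but it scales linearly with however many software trees you keep up there.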

Anyway, enough of the preamble - let's just demo how we get some files in and out of this storage.

The first thing I wanted to try (as I already had some of the tar files copied onto a Windows box) was the azcopy command line tool, currently only available for Windows. There is more info here including the download link. Once installed it can be used to upload/download files. Here is how that is achieved.

"C:\Program Files (x86)\Microsoft SDKs\Azure\AzCopy\azcopy" /source:"D:\oraclesw\" /dest: /destkey:secret-string-from-portal-ending_in== /pattern:

So breaking that down (and ignoring the resume messages in my output):
the executable is azcopy (the path shown above is the default install path)
/source is the local directory we are uploading from
/dest is the storage location - in this case we can see from the string that this is a blob location and I'm loading it into the oraclesw container I created there
/destkey is the secret key found under the access keys link in the portal for this storage account
/pattern is the filename pattern we want to upload - I'm just using an exact filename but wildcards like * work just fine

You can see from the screenshot it gives a summary at the end - in this case the 10GB file took about 3 minutes or so to upload. While it's running you also get shown the current transfer rate.

To load the same file into file storage the command is exactly the same - all you do is replace blob with file in the URL. Here's a quick pic showing that and the point-in-time transfer rate that it got.
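
As a small sketch of that blob/file swap: the two endpoints differ only in one part of the hostname. The account name mystorageaccount below is a placeholder and the function name storage_url is invented for illustration; the hostname pattern is the standard Azure public cloud one, so adjust it if your endpoint differs.

```shell
#!/bin/sh
# Build the storage endpoint URL for a service ("blob" or "file").
# Arguments: service, storage account name, container (or share) name.
# All names passed in below are placeholders, not values from this post's
# real setup.
storage_url() {
    service=$1
    account=$2
    container=$3
    printf 'https://%s.%s.core.windows.net/%s\n' "$account" "$service" "$container"
}

storage_url blob mystorageaccount oraclesw
storage_url file mystorageaccount oraclesw
```

The output of either call is the kind of value that goes after /dest: in the azcopy command, which is why switching between the two storage types is a one-word change.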

So that's all great - I got the file I wanted uploaded into Azure - but now I want to get it onto my Linux box that is actually in Azure. So how can I do that?

The easiest way to do that for blob storage is with the azcli tool. This currently has 2 versions, and the newer one (in preview) is the one I'm going to use. At first glance functionality is much the same between the two.

The install instructions are here - there is no point repeating any of that as they just worked exactly as the note said they would. The only problem I had was getting our on premise unix team to install some of the pre-reqs for me. That's still in progress, but the setup worked fine on my Red Hat server in Azure.

Once it's installed the files from blob storage can be downloaded as follows

First we set some environment variables - first is the storage account we are talking to, second is the secret key we are using.

export AZURE_STORAGE_ACCOUNT=oracle2mssqlfiles
export AZURE_STORAGE_ACCESS_KEY="secret-key-ending-in=="

Then we log in to the command line - this will take you through a login process which I won't describe here.

/opt/azure/az login

Then we set ourselves up against the correct azure subscription (only valid if you have more than one)

/opt/azure/az account set --subscription "sub name here"

Now we can download happily

/opt/azure/az storage blob download -c oraclesw -n -f /oracle/

There is more than one way to specify the account details and do the login, so I won't go into details here on the various different methods - the above worked fine for me.
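
If you want to script the download, a minimal sketch might first check that the two environment variables are set and then print the command for review rather than run it straight away. The helper names check_storage_env and print_blob_download and the blob name example.tar are made up for illustration; the az path and the -c/-n/-f flags are the ones used above.

```shell
#!/bin/sh
# Return 0 only if both variables read by the az storage commands are set
# and non-empty; otherwise print a message to stderr and return 1.
check_storage_env() {
    if [ -z "${AZURE_STORAGE_ACCOUNT:-}" ] || [ -z "${AZURE_STORAGE_ACCESS_KEY:-}" ]; then
        echo "AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_ACCESS_KEY must be set" >&2
        return 1
    fi
    return 0
}

# Print (do not run) the download command for one blob.
# Arguments: container name, blob name, local destination file.
print_blob_download() {
    container=$1
    blob=$2
    dest=$3
    printf '/opt/azure/az storage blob download -c %s -n %s -f %s\n' \
        "$container" "$blob" "$dest"
}

# example.tar is a hypothetical blob name used only for this sketch.
print_blob_download oraclesw example.tar /oracle/example.tar
```

Printing the command first is just a cautious choice for a sketch like this. Once you're happy with it, the printed line can be run directly or the printf swapped for the real call.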

So looking good - I now have the file I wanted on my Azure server - but it means I have to store it in blob storage and also store it locally too.

So what if I use the file I uploaded to 'file' storage and CIFS mount that instead?

That's even easier.

So assuming you have a Linux version where CIFS works properly, this is all you need to do

mount -t cifs // --verbose -o vers=3.0,username=yourstorageaccounthere,password=secret-key-ending-in== /mnt
mount.cifs kernel mount options: ip=x.x.x.x,unc=\\\oraclesw,vers=3.0,user=yourstorageaccountnamehere,pass=********

Run that and we have all the files from file storage under the oraclesw share mounted at /mnt

[root@xx]# df -k
Filesystem                                         1K-blocks     Used  Available Use% Mounted on
/dev/sda2                                           30192228  7008772   23183456  24% /
devtmpfs                                             3558108        0    3558108   0% /dev
tmpfs                                                3567780        0    3567780   0% /dev/shm
tmpfs                                                3567780     8516    3559264   1% /run
tmpfs                                                3567780        0    3567780   0% /sys/fs/cgroup
/dev/sda1                                             508580   119244     389336  24% /boot
tmpfs                                                 713560        0     713560   0% /run/user/1000
/dev/sdc1                                          103079864 26999624   70821028  28% /oracle
// 1048576000  7745088 1040830912   1% /mnt

To confirm we can go and have a look

[root@xx]# cd /mnt
[root@xx]# ls

So in this case the file is not 'duplicated' - it's just stored once and presented in more than one way. You just pay a lot more for that.
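
For scripting the mount step, here is a sketch that builds the share path from the storage account and share names and prints the resulting mount command. The account name mystorageaccount, the STORAGE-KEY-HERE placeholder and both function names are assumptions for illustration; the hostname pattern is the standard Azure public cloud one, and the mount options are the ones used above.

```shell
#!/bin/sh
# Build the //host/share path for an Azure file share.
# Arguments: storage account name, share name.
share_unc() {
    unc_account=$1
    unc_share=$2
    printf '//%s.file.core.windows.net/%s\n' "$unc_account" "$unc_share"
}

# Print (do not run) the cifs mount command. The password is left as a
# literal placeholder so no real key ends up in a script or shell history.
# Arguments: storage account name, share name, local mount point.
print_mount_cmd() {
    account=$1
    share=$2
    mount_dir=$3
    printf 'mount -t cifs %s %s -o vers=3.0,username=%s,password=STORAGE-KEY-HERE\n' \
        "$(share_unc "$account" "$share")" "$mount_dir" "$account"
}

print_mount_cmd mystorageaccount oraclesw /mnt
```

Note that the username is the storage account name and the password is one of its access keys, the same key used with azcopy earlier.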

So there you have it: some basic instructions on getting files in and out of Azure storage. There are other GUI tools to help with this (even for Linux), but it's always nice to know the command line stuff, especially to enable automation and scripting of processes.


  1. Hi Rich,

    I really don't want to be a language terrorist and I know even Oracle is misusing the term but it is "on premises" :)


  2. :-)

    It comes to something when my English is being corrected by a non native speaker.

    I'll now go in the corner and write out 'on premises' 100 times as punishment.....