Product Solutions Support
Try MemSQL

Quick Start with Amazon Webservices

The fastest way to deploy MemSQL is on the Amazon Web Services (AWS) platform on EC2 instances using cloudformation.memsql.com. The only prerequisite here is that you have an existing AWS account. If you do not want to use cloudformation.memsql.com, you can also create a MemSQL single-instance cluster (i.e. master aggregator and leaf node in a single machine) quickly through the AWS Marketplace. Finally, if you cannot use cloudformation.memsql.com or the AWS marketplace, you can also set up MemSQL on AWS manually. All three scenarios are covered in this quick start tutorial.

Using cloudformation.memsql.com

The fastest way to deploy MemSQL is on the Amazon Web Services (AWS) platform on EC2 instances using cloudformation.memsql.com. The only prerequisite here is that you have an existing AWS account.

  1. Go to cloudformation.memsql.com

  2. Click Configure and launch cluster on AWS which will bring up the MemSQL cluster generator, which will bring up a webpage similar to the screenshot below: image

  3. Fill in the necessary fields on the form, which include:

    • MemSQL license key
    • Community Edition users are not required to provide a license key
    • Your Amazon EC2 KeyPair name
    • See the Amazon documentation for generating your own keypair.
    • The EC2 key pair must be associated with the availability zone where you plan to launch your cluster.
      Info

      You may notice that several fields in cloudformation.memsql.com are pre-populated. For first time users, we recommend sticking with those default values provided.

      Amazon Elastic Block Storage (EBS) is used for backups and snapshots. MemSQL CloudFormation Generator requires that you select at least 64GB of EBS.

  4. Click the Generate Template button

  5. In the following screen, look for the CloudFormation template generated. You should see a URL that leads to a downloadable JSON file. image

  6. Choose your desired Amazon EC2 availability zone and select Launch Cluster. This will take you to the AWS website where you can use the CloudFormation template to fully launch your cluster.

  7. After the cluster is launched, you will be redirected to the CloudFormation summary page. Click the refresh button in the top right corner. You can use this page to monitor your cluster.

  8. Once your cluster is up and running, SSH to the master aggregator using the user “ubuntu” and the keypair you specified in the config. From the master aggregator you can use the memsql command line tool to open a SQL REPL.

    $ ssh -i <path to EC2 private key> ubuntu@<aggregator hostname>
    $ memsql
    
  9. Once logged in with the MySQL client, you can run DDL and DML commands.

    Warning

    In order to connect to your cluster from outside you must first modify the security group to open port 3306 to your IP. The security group can be modified through the AWS Console.

  10. Verify that connecting to http://:9000 brings you to the latest Ops. If it does not, ensure that your AWS settings allow you to expose port 9000. You can now start creating databases and querying MemSQL! Here’s a set of sample commands to get you started. You can copy-paste these directly into your MemSQL session.

    -- create a new user
    grant all on *.* to 'myuser'@'%';
    -- create a new database
    create database mymemsqldb;
    show databases;
    use mymemsqldb;
    -- create a new rowstore table
    create table mytable (id INT PRIMARY KEY, name varchar(10));
    -- create a new columnstore table
    create table mycolumnstoretable (id INT, groupid INT, name varchar(10), key (groupid) using clustered columnstore, shard key (id));
    show tables;
    describe mytable;
    describe mycolumnstoretable;
    -- perform data query operations
    insert into mytable values (1, 'row');
    select * from mytable;
    insert into mycolumnstoretable values (1, 2, 'column');
    select * from mycolumnstoretable;
    select mytable.name, mycolumnstoretable.name from mytable inner join mycolumnstoretable on mytable.id = mycolumnstoretable.id;
    -- determine which memsql version you are using
    select @@memsql_version;
    

Using the AWS Marketplace

Launching a single-instance cluster on Amazon AWS Marketplace is simple and fast.

Steps to Launch MemSQL through the AWS Marketplace

1. Choose Community or Enterprise Edition

There are two editions available on the marketplace: Community and Enterprise. Both are “cluster in a box” configurations, including all the components of a MemSQL cluster on a single virtual machine.

MemSQL Community Edition has no software charges and comes with community support. MemSQL Enterprise Edition is priced by system RAM, on an hourly basis, and comes with enterprise features such as replication for high availability, and phone and email support.

2. Launch with 1-Click

You can go immediately from here to “1-Click” launching with the default settings. By default MemSQL will launch on an m3.2xlarge instance (4 CPU cores and 15GB of RAM), with access to ports 22 (ssh) and 80 (MemSQL Ops UI) open, and the first SSH key on your list. It will launch into your default AWS region, for example “US East (N. Virgina)”.

Once the instance is provisioned, you can access the MemSQL Ops UI in your browser, using the IP address or public DNS name of your instance. The Ops UI user is “memsql” and the password is the AWS instance ID. You can find the instance ID in your EC2 console.

http://<your-instance-ip>

user: memsql
password: <instance-id>
Info

The default port for MemSQL Ops is 9000. However, in the AWS marketplace image of MemSQL, it is configured to listen on port 80, the default HTTP port.

To connect to the MemSQL database with a MySQL-compatible client from a different machine, make sure to open port 3306 in your Security Group. The default MemSQL user is “root”, and the password is the AWS instance ID. You can find the instance ID in your EC2 console.

Optional Settings

Below are instructions for changing the default settings on your instance via the Marketplace.

Choose Region and Instance Type

AWS has several regions throughout the world. The instance types available vary by region. We recommmend the m3 or m4.2xlarge instance for good price/performance. For larger RAM capacity, the R series has a higher ratio of RAM to CPUs. For columnstore workloads, the I and D series come with ample disk storage.

Info

Some newer instance types such as the m4 series are not available under “EC2 Classic” networking, and require a virtual private cloud (VPC) and subnet to launch into.

Choose Security Group

By default MemSQL launches with port 22 (ssh) and 80 (MemSQL Ops) open. The MySQL protocol port 3306 is not publicly accessible. You can choose one of your own security groups, or manually edit the default settings.

Choose a Keypair

By default, your marketplace image will launch into your first keypair in the region, sorted alphabetically. You can choose a new keypair to use from the dropdown list. Make sure you have the private key, otherwise you will not be able to access your image via SSH.

Launch Instance

Click the “Launch with 1-Click” button. Once the instance is provisioned, you can access the MemSQL Ops UI in your browser, using the IP address or public DNS name of your instance. The Ops UI user is “memsql” and the password is the AWS instance ID. You can find the instance ID in your EC2 console.

http://<your-instance-ip>

user: memsql
password: <instance-id>
Info

The default port for MemSQL Ops is 9000. However, in the AWS marketplace image of MemSQL, it is configured to listen on port 80, the default HTTP port.

To connect to the MemSQL database with a MySQL-compatible client from a different machine, make sure to open port 3306 in your Security Group. The default MemSQL user is “root” and the password is the AWS instance ID.

Setting up on AWS Manually

Sometimes you want more configurability than allowed by cloudformation.memsql.com (which leverage CloudFormation templates), or the Amazon AWS Marketplace. If so, you can spin up EC2 nodes the normal way, and follow the standard Quick Start On-Premises guide. However, there are some “gotchas” with AWS that are outlined below.

Setting Correct Security Groups

MemSQL nodes communicate with each other on ports 3306 and 9000. Therefore your security group should be configured, at a minimum, to allow connections to each other on those ports. The easy way to do this is just open up those ports to the world. However, you may not want the world to be able to make a connection directly to MemSQL or to access the MemSQL-Ops dashboard, even though they are easily password protected. In this case specify limit access on these ports using the custom ip option.

If you’d like, you can take advantage of security groups to have a dynamic white-list. When you specify a custom ip in a security group you can, instead of inputing a network mask, put in the name of a security group. This will automatically white-list all EC2 nodes that are assigned to that security group. Start typing the name of your security group in the ip field (sg-) and you will see a pop-up menu that will allow you to select an existing security group. image

Additionally, the deploy process uses ssh to copy MemSQL-Ops binaries to all your nodes, so if you want to take advantage of that, these same considerations apply to port 22, or your SSH port of choice.

Clients connect to your aggregators on port 3306. Make sure you have a rule to allow your clients to do this. Additionally, you as the system owner will want to connect to your nodes on port 9000 to access the dashboard.

Mount Disks

Even if you configured an EC2 node to have an EBS Volume, it will not be mounted when you first start your node. If you configured it to have ephemeral storage, it will probably be mounted in /mnt but you should confirm this. You will need to ensure all nodes have a large disk mounted and make sure MemSQL-Ops is configured to deploy MemSQL to the large disk. MemSQL uses the disk for durability of in-memory tables, caching query plans, logs, and storing data in our on-disk columnar storage format. See Columnstore.

Use the df -h command to check if your disk is mounted on each node. This EC2 Node has a large(ish) ephemeral storage volume mounted to /mnt:

	ubuntu@ip-192-168-128-247:~$ df -h
	Filesystem      Size  Used Avail Use% Mounted on
	/dev/xvda1      7.8G  780M  6.6G  11% /
	none            4.0K     0  4.0K   0% /sys/fs/cgroup
	udev            7.4G   12K  7.4G   1% /dev
	tmpfs           1.5G  332K  1.5G   1% /run
	none            5.0M     0  5.0M   0% /run/lock
	none            7.4G     0  7.4G   0% /run/shm
	none            100M     0  100M   0% /run/user
	/dev/xvdb        37G   49M   35G   1% /mnt

This EC2 Node does not have a large disk mounted:

	ubuntu@ip-10-0-0-96:~$ df -h
	Filesystem      Size  Used Avail Use% Mounted on
	/dev/xvda1      7.8G  780M  6.6G  11% /
	none            4.0K     0  4.0K   0% /sys/fs/cgroup
	udev            7.9G   12K  7.9G   1% /dev
	tmpfs           1.6G  332K  1.6G   1% /run
	none            5.0M     0  5.0M   0% /run/lock
	none            7.9G     0  7.9G   0% /run/shm
	none            100M     0  100M   0% /run/user

If there is no large disk mounted, use the lsblk command to view your available disk devices:

	ubuntu@ip-10-0-0-96:~$ lsblk
	NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
	xvda    202:0    0     8G  0 disk
	└─xvda1 202:1    0     8G  0 part /
	xvdb    202:16   0   256G  0 disk

We have a 256GB Volume, xvdb, that is not mounted. To mount the disk, first you need to see if it needs a file system using file -s:

	ubuntu@ip-10-0-0-96:~$ sudo file -s /dev/xvdb
	/dev/xvdb: data

If you see a result of data then you can make the filesystem. If you already see a filesystem, you can skip this step. To make a filesystem use the mkfs tool:

	ubuntu@ip-10-0-0-96:~$ sudo mkfs -t ext4 /dev/xvdb
	mke2fs 1.42.9 (4-Feb-2014)
	Filesystem label=
	OS type: Linux
	Block size=4096 (log=2)
	Fragment size=4096 (log=2)
	Stride=0 blocks, Stripe width=0 blocks
	16777216 inodes, 67108864 blocks
	3355443 blocks (5.00%) reserved for the super user
	First data block=0
	Maximum filesystem blocks=4294967296
	2048 block groups
	32768 blocks per group, 32768 fragments per group
	8192 inodes per group
	Superblock backups stored on blocks:
		32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
		4096000, 7962624, 11239424, 20480000, 23887872

	Allocating group tables: done
	Writing inode tables: done
	Creating journal (32768 blocks): done
	Writing superblocks and filesystem accounting information: done

Now you are ready to mount the disk. If you choose to mount the disk to /var/lib/memsql, the default memsql directory, there is no need to further specify a custom data directory. First create an empty /var/lib/memsql directory, then mount the EBS volume to it:

	ubuntu@ip-10-0-0-96:~$ sudo mkdir /var/lib/memsql
	ubuntu@ip-10-0-0-96:~$ sudo mount /dev/xvdb /var/lib/memsql
	ubuntu@ip-10-0-0-96:~$ df -h
	Filesystem      Size  Used Avail Use% Mounted on
	/dev/xvda1      7.8G  780M  6.6G  11% /
	none            4.0K     0  4.0K   0% /sys/fs/cgroup
	udev            7.9G   12K  7.9G   1% /dev
	tmpfs           1.6G  332K  1.6G   1% /run
	none            5.0M     0  5.0M   0% /run/lock
	none            7.9G     0  7.9G   0% /run/shm
	none            100M     0  100M   0% /run/user
	/dev/xvdb       252G   60M  239G   1% /var/lib/memsql

After mounting, you should see in df -h that you have a large amount of space free in /var/lib/memsql, so you can proceed with a normal install.

Info

If your large volume is mounted somewhere other than /var/lib/memsql, go on to the next step to configure a custom install directory

Set a Custom Installation Directory for MemSQL

MemSQL installation directories are configured on a node-by-node basis within MemSQL Ops. To change the default MemSQL Installation directory on your first node, use the --memsql-installs-dir option in install.sh:

sudo ./install.sh --memsql-installs-dir=/mnt/memsql/
Info

This controls were MemSQL will eventually be installed. MemSQL-Ops will still be installed to its default location (/var/lib/memsql-ops).

MemSQL will be deployed to /mnt/memsql on this node. You also need to ensure that it is deployed to the correct directory on all your nodes. To do this, when you are deploying MemSQL-Ops to your other nodes using the graphical web interface, be sure to click the “edit” button before deploying! image

Once you click on edit, you can change the MemSQL Install path for one or many nodes to the location of your large volume: image

Was this article useful?