How to store hundreds of millions of simple key-value pairs in Redis

This is about using Redis to store a huge amount of data. Original article: http://instagram-engineering.tumblr.com/post/12202313862/storing-hundreds-of-millions-of-simple-key-value.

When transitioning systems, sometimes you have to build a little scaffolding. At Instagram, we recently had to do just that: for legacy reasons, we need to keep around a mapping of about 300 million photos back to the user ID that created them, in order to know which shard to query (see more info about our sharding setup). While eventually all clients and API applications will have been updated to pass us the full information, there are still plenty who have old information cached. We needed a solution that would:

  1. Look up keys and return values very quickly
  2. Fit the data in memory, and ideally within one of the EC2 high-memory types (the 17GB or 34GB, rather than the 68GB instance type)
  3. Fit well into our existing infrastructure
  4. Be persistent, so that we wouldn’t have to re-populate it if a server died

Sharding & IDs at Instagram

This post is about sharding and ID generation at Instagram. Original article: http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram

With more than 25 photos & 90 likes every second, Instagram stores a lot of data. To make sure all of this data fits into memory and is available quickly for users, we’ve begun to shard our data: in other words, place the data in many smaller buckets, each holding a part of the data.

Instagram’s application servers run Django with PostgreSQL as our back-end database. Our first question after deciding to shard out our data was whether PostgreSQL should remain our primary data-store, or whether we should switch to something else. We evaluated a few different NoSQL solutions, but ultimately decided that the solution that best suited our needs would be to shard our data across a set of PostgreSQL servers.

Nginx directory index

Enabling directory listing for a folder in nginx takes just a single “autoindex on;” directive inside the matching location block.

You can enable directory listing sitewide by putting the directive in the server block, or for all sites by putting it in the http block.

An example config file:

server {
        listen   80;
        server_name  domain.com www.domain.com;
        access_log  /var/...........................;
        root   /path/to/root;
        location / {
                index  index.php index.html index.htm;
        }
        location /somedir {
                autoindex on;
        }
}
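
As a variation on the config above, here is a minimal sketch of the sitewide case. It assumes the same placeholder domain and root path, and the /private location is a hypothetical directory added only to show how a single location can opt out. The listing is only generated for a directory request when none of the index files is found.

```nginx
server {
        listen   80;
        server_name  domain.com www.domain.com;
        root   /path/to/root;
        # Set at server level, so every location below inherits it
        autoindex on;
        location / {
                index  index.php index.html index.htm;
        }
        # Hypothetical directory that should not be listed
        location /private {
                autoindex off;
        }
}
```

Moving the same “autoindex on;” line up into the http block would apply it to every server block that does not override it.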

boot.log empty in Redhat 5

I just found that /var/log/boot.log is empty on our database server running RHEL 5, so we can’t see the startup log for services during boot to find out why Oracle 11g doesn’t load. Some quick googling turned up a workaround for the boot log.

You need to edit the “/etc/init.d/functions” file to uncomment 3 lines of code in 4 sections: success, failure, passed, warning.

Hardening WordPress

Security in WordPress is taken very seriously, but as with any other system there are potential security issues that may arise if some basic security precautions aren’t taken. This article will go through some common forms of vulnerabilities, and the things you can do to help keep your WordPress installation secure.

This article is not the ultimate quick fix to your security concerns. If you have specific security concerns or doubts, you should discuss them with people whom you trust to have sufficient knowledge of computer security and WordPress.

The name or security ID (SID) of the domain specified is inconsistent with the trust information for that domain

As per Microsoft Active Directory architecture, every object (users, groups & computers) in the domain has a unique identifier, known as a SID. These SIDs are unique alphanumeric strings that correspond to a single object in the domain. When you copy a virtual machine directory, the resultant virtual machine has the same SID as the original virtual machine. When you try joining this new virtual machine to the same domain where the original virtual machine is, Active Directory sees two machines with a single SID and warns you that there is a SID conflict.

The Windows SID needs to be changed after copying a virtual machine directory.

Cisco configuration backup

There are several methods to back up and restore a configuration of Cisco routers.

Use a TFTP Server to Back Up and Restore a Configuration

This is a step-by-step approach to copy a configuration from a router to a TFTP server, and back to another router. Before you proceed with this method, make sure you have a TFTP server on the network to which you have IP connectivity.