Downloading Your GEDCOM Genealogy Photos From MyHeritage

I've been thinking of moving my family's genealogy website from MyHeritage. The site has a large subscriber base, and lets you build a family tree collaboratively while sharing photos.

Feeling Locked-In

It's the latter that's become a sore point. After four years of fee increases, MyHeritage seems to be spending nothing on improving multimedia, beyond an iPhone app. The user interface for managing albums remains clunky. You can't download a larger album all at once; you have to download its photos individually. The workaround is to copy subsets of photos to a smaller, second album. Similarly, I've seen no improvement in video, user administration, or any other content.

In an age with dozens of free social sharing options, how can they survive? I get the impression that MyHeritage and Ancestry.com have succeeded in buying up their closest competitors. Both are primarily genealogy search engines; the website features are mainly there to attract customers to those search services. They feel confident enough that their customers are locked in to raise prices.

Downloading My Files

To give me the freedom to cancel my subscription, I need to download all family history and photos. There are three sets of files I know of:

  • GEDCOM file (family tree data)
  • Photos associated with each person in the GEDCOM file
  • Photo albums (uploaded by family members)

The GEDCOM file can be downloaded easily from the administrator account (other users won't even see this option). The photo albums can also be downloaded with a little effort. Most can be downloaded as a single .zip file that bundles up all the images. As noted above, you'll need to download larger albums in chunks. I copied them in batches of 50-60 photos to "new" albums, then downloaded those smaller albums.

The photos inside the GEDCOM file are the issue. Although there is an option to download the individual photos along with the GEDCOM file, my export didn't contain them. Perhaps I neglected to checkmark something. If you have the same problem, you'll need to do a small amount of programming to retrieve them. Open the .ged file and search for lines like this:

2 FILE http://www.myheritageimages.com/D/storage/site999999999/files/99/99/99/999999_xxxxxxxxxxxxxxxxxxxxxx.JPG

These are the URLs for each image. Each "9" stands for a digit, and each "x" for a digit or letter. One way to gather the URLs into a single file on Linux (replace your-tree.ged with the name of your exported file) is:

grep myheritageimages "your-tree.ged" | cut -b 8- > myheritage-urls.txt

Then you can download them one at a time, using a utility like wget:

for i in `cat myheritage-urls.txt`; do wget `echo $i | tr -d '\r'`; done
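
If your export uses Windows line endings or lists the same photo under several people, a slightly more defensive variant may help. This is a minimal sketch rather than a tested MyHeritage tool: it assumes every image sits on a FILE line like the one shown above, and the names extract_urls, download_all, family.ged, and photos are placeholders made up for illustration.

```shell
#!/bin/sh
# Print every MyHeritage image URL found in a GEDCOM file, one per line.
# Strips carriage returns and removes duplicate URLs.
extract_urls() {  # usage: extract_urls GEDCOM_FILE
    tr -d '\r' < "$1" |
        sed -n 's/^[0-9][0-9]* FILE \(http.*myheritageimages.*\)$/\1/p' |
        sort -u
}

# Download each URL into a directory, skipping files that already exist there.
download_all() {  # usage: download_all GEDCOM_FILE TARGET_DIR
    mkdir -p "$2"
    extract_urls "$1" |
        wget --no-clobber --directory-prefix="$2" --input-file=-
}

# Example (placeholder names):
# download_all family.ged photos
```

Because wget reads the whole list in one run and skips existing files, an interrupted download can simply be started again.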

Looking For An Alternative

I'll admit, it's tough to find another service that combines all the features I want. As mentioned above, the competition has pretty much disappeared. I can find far better web-based social/photo sharing sites for my family, but they don't include genealogy features. Conversely, I can find well-reviewed genealogy software, but it's designed for offline (desktop) use. You can export files to the web, but the data is static. It's not the same as having a live family tree where others can drop in, add and change details.

After a lot of searching, I've purchased a popular web-based genealogy application called The Next Generation of Genealogy Site Building, known as "TNG" to its users. It's designed for the user to manage their own website, and can be installed on any shared hosting service that supports PHP. The software has a long-time developer and an active community forum. It costs 25% of a one-year MyHeritage subscription, and I only have to pay that fee once unless I upgrade.

Will it be enough? For genealogy, I can already tell you that it works well. I was able to import my MyHeritage GEDCOM and photos easily. Site administration is a snap.

The challenge will be adding the content management and multimedia features. TNG can integrate with other CMS applications, but that work is on you. Some users have successfully installed it so that it appears seamlessly within their WordPress sites. That's my next goal. If I succeed, I'll have access to many WordPress plugins. And I'll be able to provide my family the experience MyHeritage won't.

Categories: SaaS

Tags: genealogy


Beyond Passwords: 5 Ways To Prevent Unauthorized Logins

Hard-to-guess passwords are important to securing your network from intrusion. But think about it from the perspective of someone (or their robot app) trying to break in. They also need to pick the right user name and point of entry, and have enough opportunity and patience to guess repeatedly until they hit on the right combination. You'll quickly see that other measures are just as significant -- and easier to implement on your OpenSSH server.

Of course, these steps won't make your site bulletproof. They will make it less appetizing to hackers than sites with an open door policy. That's a competition you should keep in mind. There's a joke where two backpackers are pursued by a grizzly bear. In the middle of flight, one suddenly sits down and calmly starts changing from hiking boots into track shoes. His partner yells, "What are you doing? You know you can't outrun a grizzly." The man replies, "I don't have to outrun him. I just have to outrun you."

1. Restrict SSH Users and Privileges

Some of your users only need to transfer files. Why risk giving them SSH accounts, which could be hijacked to run commands on your server? Assign the people who really need SSH access to a group, say "ssh". (This group is already created for you in Ubuntu installs.) Then edit your OpenSSH config file /etc/ssh/sshd_config and add this line:

AllowGroups ssh

Now only members of the ssh group will be allowed remote access. To restrict local access as well, see my post on RSSH.

2. Change the SSH Port

Security through obscurity. Every script kiddie is going to check port 22 first. Choose another port at a much higher number, and add a line to your sshd_config file, for example:

Port 13072

This measure will greatly reduce the number of failed login attempts in your system logs. It won't stop the more serious attempts, which scan the higher port ranges.
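
Once the port changes, every client has to be told about it, either with ssh -p 13072 on each connection or with an entry in the client's ~/.ssh/config. Here is a small sketch of the latter; the function name add_ssh_alias and the alias myserver are made up for illustration, and the hostname is the example one used elsewhere in this post.

```shell
# Append a host alias to an ssh client config file so that
# "ssh ALIAS" connects to HOSTNAME on the non-standard PORT.
add_ssh_alias() {  # usage: add_ssh_alias CONFIG_FILE ALIAS HOSTNAME PORT
    printf '\nHost %s\n    HostName %s\n    Port %s\n' "$2" "$3" "$4" >> "$1"
}

# Example (placeholder names):
# add_ssh_alias ~/.ssh/config myserver server.mydomain.com 13072
# ssh myserver
```

The same alias should also be picked up by scp and sftp, since they read the same client config.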

3. Fail2Ban

Fail2Ban is an ingenious app that counts failed login attempts. After a few consecutive failures, it modifies your iptables settings to drop all packets from that IP for a specified period of time (default is 10 minutes). With fail2ban, you don't have to fear brute force attacks as much. It would take weeks to try a few thousand combinations. Moreover, because packets are dropped, the attacker gets no feedback on what happened. For all they know, you're running an expensive hardware router that just kicked into high gear. They might try again later; chances are they'll move on to other sites, in search of easier prey.

Installation is as easy as:

sudo apt-get install fail2ban

Then edit /etc/fail2ban/jail.conf. Here, I've changed the defaults in the [ssh] section to allow 4 retries before lockout, and not monitor logins from localhost and another IP address (replace www.xxx.yyy.zzz with your workplace IP). I also changed the SSH port to match my OpenSSH config from above.

[ssh]
#ignoreip = 127.0.0.1
ignoreip = 127.0.0.1 www.xxx.yyy.zzz
#maxretry = 3
maxretry = 4
#port   = ssh
port    = 13072
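
To get a feel for what Fail2Ban is counting, you can tally failed password attempts per IP address yourself. This sketch assumes OpenSSH's usual "Failed password for ... from IP port N" log lines and IPv4 addresses. On Ubuntu the log is typically /var/log/auth.log, but treat that path and the function name failed_logins_by_ip as assumptions.

```shell
# Count failed SSH password attempts per source IP, busiest first.
# Output format: "<count> <ip>" per line (as produced by uniq -c).
failed_logins_by_ip() {  # usage: failed_logins_by_ip LOG_FILE
    sed -n 's/.*Failed password .* from \([0-9.][0-9.]*\) port .*/\1/p' "$1" |
        sort | uniq -c | sort -rn
}

# Example (path is an assumption; reading it usually needs root):
# sudo sh -c '. ./this-file.sh; failed_logins_by_ip /var/log/auth.log' | head
#
# To see what the jail itself has banned:
# sudo fail2ban-client status ssh
```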

4. SSH key exchange

The methods discussed above don't change the fact that we're using traditional password authentication. The user can log in from any computer, anywhere in the world. That means anyone can try to impersonate you by guessing your username and password. A popular way to restrict access to certain computers is SSH key exchange. You generate a pair of cryptographic keys, public and private, on the client, then copy the public key to the OpenSSH server.

ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub username@server.mydomain.com

You'll be prompted for a passphrase (optional) when you run ssh-keygen. The ssh-copy-id command is a shortcut that copies and appends the key to the user's list of authorized keys. If it doesn't work, you can accomplish it in two steps. On the client, copy the file over:

scp ~/.ssh/id_rsa.pub username@server.mydomain.com:~/my_remote_id_rsa.pub

On the server, append the key:

cat ~/my_remote_id_rsa.pub >> ~/.ssh/authorized_keys
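
With StrictModes on (the OpenSSH default), the server checks the ownership and permissions of the user's files before accepting a login, so loose permissions on ~/.ssh or authorized_keys are a common reason the manual route seems not to work. Below is a hedged sketch of the server-side step that also sets permissions and avoids adding the same key twice. install_pubkey is a made-up helper name, and the optional second argument exists only so it can be tried on a scratch directory.

```shell
# Append a public key to authorized_keys with safe permissions.
install_pubkey() {  # usage: install_pubkey PUBKEY_FILE [HOME_DIR]
    home=${2:-$HOME}
    mkdir -p "$home/.ssh"
    chmod 700 "$home/.ssh"                    # directory: owner only
    touch "$home/.ssh/authorized_keys"
    chmod 600 "$home/.ssh/authorized_keys"    # file: owner read/write only
    # Append the key only if that exact line is not already present.
    grep -qxF -f "$1" "$home/.ssh/authorized_keys" ||
        cat "$1" >> "$home/.ssh/authorized_keys"
}

# Example, run on the server after the scp step:
# install_pubkey ~/my_remote_id_rsa.pub
```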

Now when you log in with ssh username@server.mydomain.com, the host should no longer prompt you for a password. It "knows" that it's you because two things happened invisibly during authentication. First, your client offered its public key, and the server confirmed that it matched one in authorized_keys. Second, your computer signed a message using its private key, and the server verified that signature using its copy of the public key, something that could only succeed if you also possessed the correct private key. An additional benefit of using keys: your password was never typed in, so it was never exposed to theft by another computer.

So far, we've made login more convenient, using public key authentication in lieu of passwords. The server still accepts traditional logins from any computer. You can instruct OpenSSH to only allow access from authorized keys, by modifying /etc/ssh/sshd_config.

PasswordAuthentication no
ChallengeResponseAuthentication no

If we stop here, we probably have made our server too secure. What happens if we misconfigure OpenSSH, lose our key, or have to do an occasional login from an unauthorized computer? We'd be locked out completely, a real problem if our server isn't located nearby and we can't walk over to log in locally from the keyboard. Let's add some exceptions to the policy, for cases where we're logged in as ourselves from our private network, or from a trusted remote computer.

Match Address 192.168.0.0/24 User paul
  PasswordAuthentication yes
Match Host my_ddns_hostname.dyndns.org User paul
  PasswordAuthentication yes
# Uncomment these lines to allow your login from any computer.
# Match User paul
# PasswordAuthentication yes

Again, unless you have physical access to your server, I recommend that you give yourself a back door. Enable PasswordAuthentication for your own user account as shown above. Or create another account with a hard-to-guess name and a strong password. Restart the server with your new configuration (/etc/init.d/ssh reload) and test.
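
One cheap safeguard before reloading is sshd's test mode (sshd -t), which only checks that the configuration file parses. The wrapper below is a sketch of that idea: it refuses to reload when the check fails. SSHD_BIN and RELOAD_CMD are hypothetical variables added so the logic can be tried without touching a live server; the defaults assume the Ubuntu paths used in this post.

```shell
# Reload OpenSSH only if the configuration passes sshd's own syntax check.
check_then_reload() {
    if ! "${SSHD_BIN:-/usr/sbin/sshd}" -t; then
        echo "sshd_config has errors; not reloading" >&2
        return 1
    fi
    # Intentionally unquoted so "command argument" splits into words.
    ${RELOAD_CMD:-/etc/init.d/ssh reload}
}

# Example, run as root on the server:
# check_then_reload
```

Even with a clean check, keep your current session open and test the new login from a second terminal before disconnecting.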

5. One Time Passwords

While SSH key exchange works well, it has to be installed in advance and carried with you. There are times when it's not available to you. You may need to access your server from someone else's computer, say at a hotel or kiosk. You're worried about password theft, perhaps by malware like a key logger (http://en.wikipedia.org/wiki/Keystroke_logging). For these situations, an alternative to SSH key exchange is One Time Passwords, aka OPIE. This involves setting up an application on the server to pre-generate a list of passphrases. Each consists of six words and is good for a single login only. Even if some rogue application records the password, it can't be reused. The idea is that you'll carry a list of passwords with you and use them, one at a time.

I haven't tried OPIE, and it doesn't seem to have many adopters. Perhaps it's seen as inconvenient or unnecessary in an age where people carry computers in their pockets. However, if you're traveling and forced to log in with someone else's device, it seems like a good choice.


Categories: Linux Administration

Tags: security, ssh


RSSH: A More Secure Way To Share Files

You've set up a server and your users are happily transferring files with apps like Filezilla or Cyberduck. You know that plain FTP is not secure, so you're requiring secure FTP (SFTP). Very nice so far -- no FTP server required.

Only problem is, your users also have SSH access, and that creates a potential security vulnerability. Their accounts could be used (or exploited) to run commands, when all you want to allow them is uploading and downloading files. One way to prevent such misuse is RSSH, a restricted form of SSH that limits users to a few commands like sftp and rsync.

Installing RSSH

Install rssh:

sudo apt-get install rssh

Edit /etc/rssh.conf, and uncomment the commands you want to allow (by default, users are locked out completely):

allowscp
allowsftp
allowrsync

Configuring User Shells and Access

Our RSSH users can't do anything without user accounts, plus logon permissions through the same OpenSSH server used for normal remote logons. For the user accounts, we're going to make the OpenSSH configuration easier by assigning users to one of two groups: the ssh group will have full login access with the default shell (bash); the rssh group will run the RSSH shell. We'll start by creating the latter group:

sudo groupadd rssh

The ssh group should already exist if you're running Ubuntu Linux. If not:

sudo groupadd ssh

Add users to these groups with your favorite user management tool. For example, to add the user John Smith with login jsmith to the rssh group, the command is sudo usermod -G rssh -a jsmith. (Don't forget to also add yourself as an unrestricted user, with sudo usermod -G ssh -a your_user_name. Without -a, usermod replaces your existing supplementary groups instead of appending to them.) You should then see two entries for jsmith in the /etc/group file: one was created for his original login, the other for his membership in the rssh group. As you add other users to the rssh group, you'll see them appended to this line, with commas.

jsmith:x:1001:
rssh:x:1002:jsmith

Now restrict each of the users to the RSSH shell by running: sudo usermod -s /usr/bin/rssh username, replacing username with their login. The last field in the /etc/passwd file should reflect this change:

jsmith:x:1001:1001:John Smith,,,:/home/jsmith:/usr/bin/rssh
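
A quick way to audit which accounts ended up on the restricted shell is to read that last field back. This is a small sketch; the helper names login_shell and rssh_users are made up, and the optional file argument is there only so the functions can be pointed at a copy of /etc/passwd.

```shell
# Print the login shell (7th colon-separated field) for one user.
login_shell() {  # usage: login_shell USER [PASSWD_FILE]
    awk -F: -v u="$1" '$1 == u { print $7 }' "${2:-/etc/passwd}"
}

# List every account whose login shell is /usr/bin/rssh.
rssh_users() {  # usage: rssh_users [PASSWD_FILE]
    awk -F: '$7 == "/usr/bin/rssh" { print $1 }' "${1:-/etc/passwd}"
}

# Example:
# login_shell jsmith
# rssh_users
```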

Configure your SSH server (OpenSSH) to allow logins from only these groups. Edit /etc/ssh/sshd_config and add an AllowGroups line:

# Comment out any AllowUsers line, because it will override AllowGroups
# AllowUsers jsmith
AllowGroups ssh rssh

Be careful when editing sshd_config on a remote server. A single typo could lock you out, even though you still have a valid password. I recommend you test locally first, and check for conflicting AllowUsers directives before deploying elsewhere. Make sure you have a backdoor way to access your server if you misconfigure OpenSSH.
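
One lockout that is easy to check for in advance is forgetting to put your own account in an allowed group. The sketch below reads the AllowGroups lines from a config file and tests whether a given user belongs to any of those groups. It assumes plain group names (no wildcard patterns), and the function names are made up for illustration.

```shell
# Print each group named on AllowGroups lines, one per line.
# Commented-out lines are ignored; the keyword match is case-insensitive.
allowed_groups() {  # usage: allowed_groups CONFIG_FILE
    awk 'tolower($1) == "allowgroups" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# Succeed (exit status 0) if USER belongs to at least one allowed group.
user_is_allowed() {  # usage: user_is_allowed CONFIG_FILE USER
    for g in $(allowed_groups "$1"); do
        id -nG "$2" | tr ' ' '\n' | grep -qxF "$g" && return 0
    done
    return 1
}

# Example:
# user_is_allowed /etc/ssh/sshd_config your_user_name || echo "you would be locked out"
```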

Restart the ssh server, and you can begin testing, using either /etc/init.d/ssh restart or sudo service ssh restart.

Test by logging in as each type of user. When someone assigned to the rssh group logs in, a message informs him about the restriction:

This account is restricted by rssh.
Allowed commands: scp sftp rsync 

Creating Jails and Dropboxes

The configuration so far limits what commands the rssh users can run. However, it doesn't restrict their ability to view files throughout the filesystem. For example, try connecting with an SFTP client, and you'll see that rssh users still have the ability to transfer files from any directory to which they have read permissions.

Take a moment to look at the other configuration options and examples in /etc/rssh.conf. If you want to chroot individual users or permit them only a subset of commands, this is one place to do it. Be forewarned that a chroot environment involves more expertise than our simple setup.

A better place to enforce a chroot environment may be the OpenSSH configuration, as in this blog article. Say you want users to be able to download and share files, similar to a commercial service like Dropbox. You could create a common chroot for everyone, like /usr/local/dropbox. Then create individual user directories within it, such as /usr/local/dropbox/home/jsmith. Set permissions to ensure whatever privacy you need, and you have a pretty secure place to store and exchange files.
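
The directory layout described above can be sketched in a few lines. This is only the filesystem half, under stated assumptions: OpenSSH's ChrootDirectory option expects every component of the chroot path to be root-owned and not writable by group or others, so the top levels get mode 755 while each user's directory is private. make_dropbox is a made-up helper name, and the chown step is left as a comment because it needs root.

```shell
# Create a shared chroot root with one private directory per user.
make_dropbox() {  # usage: make_dropbox ROOT_DIR USER...
    root=$1
    shift
    mkdir -p "$root/home"
    chmod 755 "$root" "$root/home"      # readable by all, writable by owner only
    for u in "$@"; do
        mkdir -p "$root/home/$u"
        chmod 700 "$root/home/$u"       # private to its owner
        # On a real server, run as root and hand the directory to its user:
        # chown "$u" "$root/home/$u"
    done
}

# Example, run as root:
# make_dropbox /usr/local/dropbox jsmith
```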


Categories: Linux Administration

Tags: security, ssh
