Architecture for merging multiple user accounts together

I am faced with the exact same task at the moment. The design I worked out is rather simple, but it works well.

The core idea is that models for a local site identity and the third-party site identities are kept isolated, but are later linked. So every user that logs into the site has a local identity which maps to any number of third-party site identities.

A local identity record contains a minimum of information - it could even be a single field - just a primary key. (For my application, I don't care about the user's email, name, or birth date - I just want to know they're the person who has been logging into this account all along.)

The third-party identities contain information relevant only to authenticating with a third-party. For OAuth, this typically means a user identifier (like an id, email, or username) and a service identifier (indicating what site or service was authenticated with). In other parts of the application, outside of the database, that service identifier is paired with a method for retrieving the relevant user identifier from that service, and that is how authentication is performed. For OpenID, we employ the same approach, except the method for authenticating is more generalized (because we can almost always perform the exact same protocol - except we use a different identity URL, and that is our service identifier).

Finally, I keep records of which third-party identities are paired with which local identity. To generate these records, the flow looks like this:

A user logs in for the first time using a third-party identity. A local identity record is created, then a third-party identity record, and then they are paired.
In a control panel, the user is offered the opportunity to link an account by logging in to third-party services. (Pretty straightforward how this works.)
In the scenario where the user unwittingly makes multiple accounts, the solution is pretty simple. While logged in on one of the accounts, the user logs into another third-party service which they previously used to log into the site (via the control panel feature above). The web service detects this collision (the local identity of the logged-in user differs from the local identity that is linked to the third-party identity that just logged in) and the user is prompted with an account merge.
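The three cases above can be sketched roughly as follows. This is only a minimal illustration under my own assumptions: the table names, column names, and action strings are made up, and the pairing record is folded into a `local_id` column on the third-party identity table rather than kept as a separate table.

```python
import sqlite3

def open_db():
    """Create an in-memory database with hypothetical table and column names."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE local_identity (id INTEGER PRIMARY KEY);
        CREATE TABLE third_party_identity (
            service     TEXT NOT NULL,     -- which site or service authenticated
            external_id TEXT NOT NULL,     -- user identifier at that service
            local_id    INTEGER NOT NULL REFERENCES local_identity(id),
            PRIMARY KEY (service, external_id)
        );
    """)
    return db

def handle_third_party_login(db, service, external_id, current_local_id=None):
    """Return (action, local_id) after a successful third-party authentication."""
    row = db.execute(
        "SELECT local_id FROM third_party_identity"
        " WHERE service = ? AND external_id = ?",
        (service, external_id),
    ).fetchone()
    linked = row[0] if row else None

    if current_local_id is None:
        if linked is not None:
            return ("login", linked)
        # First login ever: create the local identity, then pair it.
        local_id = db.execute(
            "INSERT INTO local_identity DEFAULT VALUES").lastrowid
        db.execute("INSERT INTO third_party_identity VALUES (?, ?, ?)",
                   (service, external_id, local_id))
        return ("created", local_id)

    if linked is None:
        # Logged-in user links a new service from the control panel.
        db.execute("INSERT INTO third_party_identity VALUES (?, ?, ?)",
                   (service, external_id, current_local_id))
        return ("linked", current_local_id)
    if linked != current_local_id:
        # Collision: this identity already belongs to another local identity.
        return ("merge_prompt", linked)
    return ("already_linked", current_local_id)
```

The `merge_prompt` result carries the other local identity's id so the caller can ask the user to confirm before anything is merged.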

Merging accounts is a matter of merging each individual field of the local identity (which will vary from application to application, and should be easy if you have only a couple fields in your local identity records), and then ensuring the linked third-party identities are linked to the resultant local identity.
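As a rough sketch of that merge, here is an in-memory version using plain dictionaries. The field names and the "kept account wins unless its field is empty" policy are assumptions for illustration; a real application would pick its own per-field rules.

```python
def merge_local_identities(identities, links, keep_id, drop_id):
    """Merge drop_id into keep_id.

    identities: dict mapping local id -> dict of fields
    links:      dict mapping (service, external_id) -> local id
    """
    if keep_id == drop_id:
        raise ValueError("cannot merge an identity into itself")
    kept = identities[keep_id]
    dropped = identities.pop(drop_id)

    # Field-by-field merge: keep existing values, fill gaps from the
    # dropped identity (assumed policy).
    for field, value in dropped.items():
        if kept.get(field) is None:
            kept[field] = value

    # Re-point every third-party identity at the surviving local identity.
    for key, local_id in list(links.items()):
        if local_id == drop_id:
            links[key] = keep_id
    return kept
```

With only a handful of fields on the local identity, the field loop stays trivial; the important part is that no third-party identity is left pointing at the removed record.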

I tend to find a lot of sites merging based on email as the overlapping joining factor.

I can see this being a viable option, but again it depends on how you prefer to merge. An email address is the primary way sites verify important changes and send important notices: changing your password, termination of service, a low account balance, and so on. It's almost like the web's social security number, except that you can also communicate through it. Culturally, I think it's reasonable to assume that an email address is a fairly unique identity across OAuth authentication services. After all, it's what the login forms for Facebook and Google ask for.

My current thought process.

The login page has 3 options

Your own site's membership
Log in with Facebook
Log in with Google

1) User logs in for the first time: trigger a registration flow where an account is created and populated for the first time.

 if the user logs in using Facebook (or whatever 3rd party login)
      1) call the Facebook api asking for their information (email, name, etc...) 
      2) create an account membership entry in your database somewhat like this 

         Table = Users
         [ UserId   |       Email             | Password ]
         [    23     | "[email protected]" |  *null*  ]

      3) create an external auths entry like so
         *ProviderUserId is the unique id of that user on the provider's site

         Table = ExternalAuths
         [ ExternalAuthId  |  User_UserId   | ProviderName |   ProviderUserId  ]
         [    56           |      23        |   Facebook   |  "max.alexander.9"]

 if the user wants to create an account with your own registration it would just be this           

         Table = Users
         [ UserId   |       Email           |   Password  ]
         [    23     | [email protected] |  myCoolPwd  ]

2) At some other time, the user comes back but decides to click on the Google Login

      1) call the Google api asking for their information (email, name, etc...) 

      2) once you get the email, match it up to the userId entry with the existing email 

      3) create an additional External auth entry as such

         Table = ExternalAuths
         [ ExternalAuthId  |  User_UserId   | ProviderName |   ProviderUserId  ]
         [    56           |      23        |   Facebook   |  "max.alexander.9"]
         [    57           |      23        |    Google    |  "1234854368"     ]

3) Now the accounts are merged, on the assumption that the email in your database entry and the email reported by the external login are the same and that you trust both.
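Steps 1 to 3 could look something like the sketch below, using SQLite and the table layout above. The function name, the email normalization, and the `example.com` address in the usage note are my own assumptions, and the whole approach only holds if you trust that the provider has verified the email it reports.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Users (
    UserId   INTEGER PRIMARY KEY,
    Email    TEXT NOT NULL UNIQUE,
    Password TEXT                      -- NULL for external-only accounts
);
CREATE TABLE ExternalAuths (
    ExternalAuthId INTEGER PRIMARY KEY,
    User_UserId    INTEGER NOT NULL REFERENCES Users(UserId),
    ProviderName   TEXT NOT NULL,
    ProviderUserId TEXT NOT NULL,
    UNIQUE (ProviderName, ProviderUserId)
);
"""

def login_with_provider(db, provider_name, provider_user_id, email):
    """Return the UserId for an external login, creating rows as needed."""
    # Known external identity: plain login.
    row = db.execute(
        "SELECT User_UserId FROM ExternalAuths"
        " WHERE ProviderName = ? AND ProviderUserId = ?",
        (provider_name, provider_user_id),
    ).fetchone()
    if row:
        return row[0]

    # Unknown external identity: match on email, or register a new user.
    email = email.strip().lower()
    row = db.execute(
        "SELECT UserId FROM Users WHERE Email = ?", (email,)).fetchone()
    if row:
        user_id = row[0]
    else:
        user_id = db.execute(
            "INSERT INTO Users (Email, Password) VALUES (?, NULL)",
            (email,)).lastrowid

    db.execute(
        "INSERT INTO ExternalAuths (User_UserId, ProviderName, ProviderUserId)"
        " VALUES (?, ?, ?)",
        (user_id, provider_name, provider_user_id))
    return user_id
```

Calling it first with `("Facebook", "max.alexander.9", "max@example.com")` and later with `("Google", "1234854368", "max@example.com")` yields the same `UserId` and two `ExternalAuths` rows for it.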

So for subsequent logins

So what if you have external logins first and then you want a user to be able to login with a password later on?

I see two easy ways to do this

On any first login when an account is created from an external auth, ask them for a password to complete their first entry into your application
If they registered using Facebook or Google first and later try to register through your own site's registration form, detect that the email address they entered already exists, ask them for a password, and send them an email confirmation after the registration is complete.

I've gone through this with sled.com. There are multiple issues here with regard to creating accounts and supporting multiple third-party accounts for login. Some of them are:

Do you need to support both a local password and third-party logins?

For sled.com, I've decided to drop local password due to the small value it adds and the additional cost in securing a password entry form. There are many known attacks for breaking passwords and if you are going to introduce passwords you must make sure they are not easy to break. You also need to store them in a one-way-hash or something similar to prevent them from being leaked.

How much flexibility do you want to allow in supporting multiple third-party accounts?

It sounds like you already chose the three login providers: Facebook, Twitter, and LinkedIn. That's great because it means you are using OAuth and working with a well-defined set of trusted providers. I'm no fan of OpenID. The remaining question is if you need to support multiple third-party accounts from the same provider (e.g. one local account with two Twitter accounts linked). I'm assuming no, but if you do, you will need to accommodate that in your data model.

For Sled, we support login with Facebook, Twitter, and Yahoo!, and within each user account we store a key for each one: { "_id":"djdjd99dj", "yahoo":"dj39djdj", "twitter":"3723828732", "facebook":"12837287" }. We set up a bunch of constraints to ensure that each third-party account can only be linked to a single local account.

If you are going to allow multiple accounts from the same third-party provider you will need to use lists or other structures to support that, and with that, all the other restrictions to ensure uniqueness.
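To make the uniqueness rules concrete, here is a small in-memory sketch of the one-account-per-provider variant. The class and method names are hypothetical and this is not how Sled implements it; a real system would enforce the same rules with database constraints or unique indexes.

```python
class LinkError(Exception):
    """Raised when a link would violate a uniqueness rule."""

class AccountStore:
    def __init__(self):
        self.accounts = {}  # local id -> {provider: third-party id}
        self.owners = {}    # (provider, third-party id) -> local id

    def link(self, local_id, provider, third_party_id):
        account = self.accounts.setdefault(local_id, {})
        # Rule 1: at most one linked account per provider.
        if provider in account:
            raise LinkError(f"{local_id} already has a {provider} account")
        # Rule 2: a third-party account belongs to one local account only.
        if (provider, third_party_id) in self.owners:
            raise LinkError(f"{provider}:{third_party_id} is already linked")
        account[provider] = third_party_id
        self.owners[(provider, third_party_id)] = local_id
```

Allowing several accounts per provider would mean dropping rule 1 and storing a list per provider, while rule 2 stays exactly the same.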

How to link multiple accounts?

The first time the user signs-up for your service, they first go to the third party provider and come back with a verified third-party id. You then create a local account for them and collect whatever other information you want. We collect their email address and also ask them to pick a local username (we try to pre-populate the form with their existing username from the other provider). Having some form of local identifier (email, username) is very important for account recovery later.

The server knows this is a first-time login if the browser does not have a session cookie (valid or expired) for an existing account and the third-party account used is not found. We try to inform the user that they are not just logging in but creating a new account, so that if they already have an account, they will hopefully pause and log in with their existing account instead.

We use the exact same flow to link additional accounts, but when the user comes back from the third party, the presence of a valid session cookie is used to differentiate between an attempt to link a new account and a login action. We only allow one third-party account of each type, and if there is already one linked, we block the action. It should not be a problem because the interface to link a new account is disabled if you already have one (per provider), but just in case.

How to merge accounts?

If a user tries to link a new third-party account which is already linked to a local account, you simply prompt them to confirm they want to merge the two accounts (assuming you can handle such a merge with your data set - often easier said than done). You can also provide them with a special button to request a merge, but in practice all they are doing is linking another account.

This is a pretty simple state machine. The user comes back from the third party with a third-party account id. Your database and the session can be in one of four states:

The account is linked to a local account and no session cookie is present --> Login
The account is linked to a local account and a session cookie is present --> Merge
The account is not linked to a local account and no session cookie is present --> Signup
The account is not linked to a local account and a session cookie is present --> Linking Additional account
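The four states above map onto a very small decision function. This is only a sketch: the function name and return strings are mine, and the extra "noop" case (the third-party account is already linked to the account in the session) is an assumption I added so that a user is never asked to merge an account with itself.

```python
from typing import Optional

def classify_callback(linked_local_id: Optional[str],
                      session_local_id: Optional[str]) -> str:
    """Decide what a returning third-party callback means.

    linked_local_id:  local account the third-party id is linked to, or None
    session_local_id: local account from a valid session cookie, or None
    """
    if linked_local_id is None:
        return "signup" if session_local_id is None else "link"
    if session_local_id is None:
        return "login"
    if session_local_id == linked_local_id:
        return "noop"  # assumed extra case: nothing to merge
    return "merge"
```

For example, `classify_callback("acct-1", "acct-2")` returns `"merge"`, while `classify_callback(None, None)` returns `"signup"`.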

How to perform account recovery with third-party providers?

This is still experimental territory. I have not seen a perfect UX for this as most services provide both a local password next to the third-party accounts and therefore focus on the "forgot my password" use case, not everything else that can go wrong.

With Sled, we've opted to use "Need help signing in?" and when you click, ask the user for their email or username. We look it up and if we find a matching account, email that user a link which can automatically log them into the service (good for one time). Once in, we take them directly to the account linking page, tell them they should take a look and potentially link additional accounts, and show them the third-party accounts they already have linked.
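A one-time login link of this kind can be sketched as below. This is not Sled's implementation; the class name, the in-memory storage, and the expiry window are assumptions on my part (the description above only says the link is good for one use).

```python
import hashlib
import secrets
import time

class OneTimeLoginLinks:
    """Issue and redeem single-use login tokens (in-memory sketch)."""

    def __init__(self, ttl_seconds=900):
        self.ttl_seconds = ttl_seconds  # assumed expiry window
        self._pending = {}              # sha256(token) -> (local id, expiry)

    @staticmethod
    def _digest(token):
        # Store only a hash so a leaked table does not leak usable tokens.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, local_id, now=None):
        """Return a random token to embed in the emailed link."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._pending[self._digest(token)] = (local_id, now + self.ttl_seconds)
        return token

    def redeem(self, token, now=None):
        """Return the local id once, or None if unknown, used, or expired."""
        now = time.time() if now is None else now
        entry = self._pending.pop(self._digest(token), None)
        if entry is None:
            return None
        local_id, expires_at = entry
        return local_id if now <= expires_at else None
```

The token is removed on the first redemption attempt, so following the same link a second time simply fails and the user has to request a new one.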

Both approaches for auto-merging accounts leave a pretty big vulnerability that would allow someone to take over an account. They both seem to assume that the user is who they say they are when they offer the merge option to a registering user.

My recommendation for mitigating the vulnerability is to request the user authenticate with one of the known Identity Providers prior to performing the merge to verify the identity of the user.

Example: User A registers with a Facebook identity. Sometime later they go back to your site, attempt to access it with Windows Live ID, and start the registration process. Your site will prompt User A with: "It looks like you have registered with Facebook previously. Please login with Facebook (provide link) and we can merge your Windows Live ID with your existing profile."
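In code, the check boils down to "has this browser session proven ownership of an identity that is already linked to the target account?". A minimal sketch, with a hypothetical function name and made-up identifiers:

```python
def merge_allowed(target_local_id, proven_in_session, links):
    """Allow a merge only after the user has authenticated with an
    identity provider that is already linked to the target account.

    proven_in_session: (provider, external id) pairs authenticated
                       during this browser session
    links:             dict mapping (provider, external id) -> local id
    """
    return any(links.get(identity) == target_local_id
               for identity in proven_in_session)

# Hypothetical data: Facebook identity "fb-1" is linked to local account 7.
links = {("facebook", "fb-1"): 7}

# Only the new Windows Live ID has been proven so far: refuse the merge.
only_new = [("windows_live", "wl-9")]
# After also logging in with Facebook, the merge may go ahead.
both = [("windows_live", "wl-9"), ("facebook", "fb-1")]
```

Here `merge_allowed(7, only_new, links)` is false and `merge_allowed(7, both, links)` is true, so a matching email address alone never triggers the merge.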

Another alternative is to store a shared secret (password or personal question) at initial registration that the user must provide when merging identities. However, this gets you back into the business of storing shared secrets. It also means you have to handle the scenario where the user does not remember the shared secret, and the workflow that goes along with it.

Most of the posts are quite old and I guess Google's free Firebase Authentication service wasn't yet around. After verifying with OAuth, you pass the OAuth token to it and get a unique user id which you can store for reference. Supported providers are Google, Facebook, Twitter, GitHub and there is an option to register custom and anonymous providers.
