Best way to handle negative sampling in Tensorflow 2.0 with Keras

Negative sampling is a technique where, instead of reducing the weights of every word that is not in the context, we sample only a small number of those out-of-context words and update their weights as negative examples.

So, in our implementation, we use a sigmoid activation instead of softmax: for words that are in the context we want our network to output 1, and for words that are not in the context we want it to output 0.
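As a sketch of how those (target, context) pairs and their 0/1 labels can be generated, TF 2.0 Keras ships a skipgrams helper that emits positive pairs from a sliding window and randomly samples the negative pairs for you. The toy sentence and vocabulary size below are just illustrative assumptions:

from tensorflow.keras.preprocessing.sequence import skipgrams

vocab_size = 10000                # assumed vocabulary size
sentence = [42, 7, 256, 3, 891]   # a toy sentence encoded as word indices

# negative_samples=1.0 draws one random negative pair per positive pair
couples, labels = skipgrams(sentence, vocab_size,
                            window_size=3, negative_samples=1.0)

# each couple is a (target, context) pair; its label is 1 when the
# context word really occurred near the target, 0 for a sampled negative
word_target, word_context = zip(*couples)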

Yes, your observation is correct: because the model takes two inputs (the target word and the context word), we need to use the Functional API instead of the Sequential API.

The code to implement negative sampling in TF 2.0 Keras is shown below:

from tensorflow.keras.layers import Input, Embedding, Reshape, Dot, Dense
from tensorflow.keras.models import Model

vocab_size = 10000   # size of the vocabulary
vector_dim = 300     # dimensionality of the word embeddings

# create some input variables
input_target = Input((1,))
input_context = Input((1,))

# a single embedding layer shared by the target and context words
embedding = Embedding(vocab_size, vector_dim, input_length=1, name='embedding')

target = embedding(input_target)
target = Reshape((vector_dim, 1))(target)
context = embedding(input_context)
context = Reshape((vector_dim, 1))(context)

# setup a cosine similarity operation which will be output in a secondary model
# (Dot with normalize=True computes cosine similarity in TF 2.0 Keras,
# replacing the removed Keras 1 merge(..., mode='cos') API)
similarity = Dot(axes=1, normalize=True)([target, context])

# now perform the dot product operation to get a similarity measure
dot_product = Dot(axes=1)([target, context])
dot_product = Reshape((1,))(dot_product)
# add the sigmoid output layer
output = Dense(1, activation='sigmoid')(dot_product)

# create the primary training model
model = Model(inputs=[input_target, input_context], outputs=output)
model.compile(loss='binary_crossentropy', optimizer='rmsprop')

# create a secondary validation model to run our similarity checks during training
validation_model = Model(inputs=[input_target, input_context], outputs=similarity)
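As a rough sketch of how the two models might be used together, assuming word_target, word_context and labels come from a skipgrams generator like the one above (the step count and word indices are arbitrary illustrations):

import numpy as np

# reshape to (num_pairs, 1) to match the Input((1,)) layers
word_target = np.array(word_target, dtype='int32').reshape(-1, 1)
word_context = np.array(word_context, dtype='int32').reshape(-1, 1)
labels = np.array(labels, dtype='int32').reshape(-1, 1)

# train on one (target, context, label) triple per step
for step in range(1000):
    idx = np.random.randint(0, len(labels))
    loss = model.train_on_batch(
        [word_target[idx:idx + 1], word_context[idx:idx + 1]],
        labels[idx:idx + 1])

# cosine similarity between two word indices, via the secondary model
sim = validation_model.predict([np.array([[42]]), np.array([[256]])])

Note that validation_model shares the embedding layer with model and is never compiled or trained; it only reads out the cosine similarity learned so far.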

For more information, please refer to this awesome article.

Hope this helps. Happy Learning!