Importing Embedding matrices


I wanted to ask whether there is any way to import embedding matrices, e.g. ones created from GloVe or fastText embeddings. A more general question is whether it is possible to import data into DLS in formats other than .csv (e.g. .npy, .npz, etc.).

Thanks in advance.

Kind regards,

Hi Theodore,

If your dataset cannot be imported with the normal method (a .csv with values/filenames), we support a catch-all mechanism: importing your dataset in NumPy format.

You can first convert your dataset to NumPy format and save each row's data in its own NumPy file (using the code below). Then create a train.csv with one column for the NumPy filenames and another column for the output data:

input_data, output_data
a.npz, label1
b.npz, label2
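As a sketch, the per-row .npz files and the train.csv above could be generated like this (the filenames and labels are just the placeholders from the example, and the 300-element vectors are illustrative):

```python
import csv

import numpy as np

# Save each sample (row) as its own compressed NumPy archive.
rows = {"a.npz": np.random.rand(300), "b.npz": np.random.rand(300)}
for fname, vec in rows.items():
    np.savez(fname, vec)

# Build train.csv: one column of NumPy filenames, one column of labels.
with open("train.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["input_data", "output_data"])
    writer.writerow(["a.npz", "label1"])
    writer.writerow(["b.npz", "label2"])
```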

Following is a sample code of how we save numpy data and load it in DLS.

import numpy as np

# Save one sample as a compressed NumPy archive.
a = np.zeros((2, 1))
np.savez("a.npz", a)

# Load it back the way DLS does: read the first array stored in the archive.
npzfile = np.load("a.npz")
x = npzfile[npzfile.files[0]]

Does this work for your use case?


Thanks Rajendra,

This makes the whole process easier, for certain models at least.

Will give it a try.

Kind regards,

Hi Theodore, were you able to import pretrained word embeddings like GloVe/Word2Vec? Please let me know the process.

Yes. For that, either save your embedding vectors as a NumPy array, or save them into a CSV file with the values separated by semicolons.
Yes, you can import .npy files; just store the paths of the NumPy files in the CSV, similar to what we do for images.
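A rough sketch of both options (the filenames and the tiny 3×5 matrix here are made up for illustration):

```python
import numpy as np

# Hypothetical embedding matrix: 3 words, 5 dimensions each.
embeddings = np.random.rand(3, 5)

# Option 1: write one row of semicolon-separated values per word.
np.savetxt("embeddings.csv", embeddings, delimiter=";")

# Option 2: save the whole matrix as a .npy file and reference its path
# in the dataset CSV, the same way image datasets reference image paths.
np.save("embeddings.npy", embeddings)

restored = np.load("embeddings.npy")
```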

Hi, can you give a link to an example of "import npy file just store the path of the numpy files in the csv similar we do for images"?

In this article I am saving each 3-D image into a NumPy file.
This is the zip file structure (original screenshot not included here):

In this dataset I have 10 categories, so there are 10 different folders.
Inside every folder there is one NumPy file per row (sample).

If you follow this file structure, DLS will automatically generate a CSV file for your dataset, with the folder names as the labels.

Below is the code to save each sample as a NumPy file (attached as a screenshot in the original post).

In it, labels is the list of the 10 folder names
and X is the array in which the dataset is stored.
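Since that code was shared as an image, here is a minimal sketch of what it plausibly looks like; the folder names, the sample shape (28×28×3), and the one-sample-per-class setup are stand-in assumptions:

```python
import os

import numpy as np

# Stand-ins for the poster's variables:
labels = [f"class_{i}" for i in range(10)]  # the 10 folder names
X = np.random.rand(10, 28, 28, 3)           # one 3-D sample per class here

# One folder per category; one .npy file per row (sample) inside it.
for i, label in enumerate(labels):
    os.makedirs(label, exist_ok=True)
    np.save(os.path.join(label, f"sample_{i}.npy"), X[i])
```

With this layout in a zip file, DLS can derive the label of each sample from its parent folder name.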

I am trying to upload the GloVe word embedding to the project, but failed to do so. I clicked on "Advanced Layer" to add a custom model; after clicking the model, it asked me to upload a Keras model file with the extension .yaml. May I know how to obtain the .yaml file from the GloVe word embedding?

I also tried to import the GloVe embedding file in another format, .npy, but failed to upload the file. I used the following script to create the .npy file from the GloVe word embedding:

import numpy as np
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('input', help='Single embedding file')
parser.add_argument('output', help='Output basename without extension')
args = parser.parse_args()

embeddings_file = args.output + '.npy'
vocabulary_file = args.output + '.txt'
words = []
vectors = []

# Each GloVe line is "word v1 v2 ... vN"; read in binary and decode the word.
with open(args.input, 'rb') as f:
    for line in f:
        fields = line.split()
        word = fields[0].decode('utf-8')
        vector = np.fromiter((float(x) for x in fields[1:]), dtype=np.float32)
        words.append(word)
        vectors.append(vector)

# Save the stacked embedding matrix and the matching vocabulary list.
matrix = np.array(vectors)
np.save(embeddings_file, matrix)
text = '\n'.join(words)
with open(vocabulary_file, 'wb') as f:
    f.write(text.encode('utf-8'))

Please let me know the process to add a custom layer such as a GloVe word embedding.

I would like to add an embedding layer initialized with the weights of GloVe vectors; what is the procedure? Please let me know.
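For reference, outside of DLS the usual procedure is to build a weight matrix that maps each tokenizer index to its GloVe vector, and use that matrix to initialize a frozen embedding layer. A sketch with stand-in data (the tiny glove dict, the 4-dimensional vectors, and the vocab are illustrative only):

```python
import numpy as np

# Stand-in for a parsed GloVe file: word -> vector
# (a real file like glove.6B.100d.txt has ~400k words, 100 dims).
glove = {"the": np.ones(4), "cat": np.full(4, 2.0)}
embedding_dim = 4

# Tokenizer vocabulary: word -> integer index (index 0 reserved for padding).
vocab = {"the": 1, "cat": 2, "dog": 3}

# Build the weight matrix row by row; words missing from GloVe stay zero.
embedding_matrix = np.zeros((len(vocab) + 1, embedding_dim))
for word, idx in vocab.items():
    if word in glove:
        embedding_matrix[idx] = glove[word]

# In Keras, this matrix would then initialize a frozen Embedding layer, e.g.:
# Embedding(input_dim=len(vocab) + 1, output_dim=embedding_dim,
#           embeddings_initializer=Constant(embedding_matrix),
#           trainable=False)
```

Setting the layer to non-trainable keeps the pretrained GloVe weights fixed during training.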