
Many-to-one and many-to-many LSTM examples in Keras

I am trying to understand LSTMs and how to build them with Keras. I found out that there are principally 4 modes to run an RNN (the 4 right ones in the picture):

https://i.stack.imgur.com/b4sus.jpg

Now I wonder what a minimalistic code snippet for each of them would look like in Keras. So something like

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(128, input_shape=(timesteps, data_dim)))
model.add(Dense(1))

for each of the 4 tasks, maybe with a little bit of explanation.


Marcin Możejko

So:

One-to-one: you could use a Dense layer, since you are not processing sequences:

model.add(Dense(output_size, input_shape=input_shape))

One-to-many: this option is not supported well, as chaining models is not very easy in Keras, so the following version is the easiest one:

model.add(RepeatVector(number_of_times, input_shape=input_shape))
model.add(LSTM(output_size, return_sequences=True))

Many-to-one: actually, your code snippet is (almost) an example of this approach:

model = Sequential()
model.add(LSTM(1, input_shape=(timesteps, data_dim)))

Many-to-many: this is the easiest snippet, when the length of the input and output matches the number of recurrent steps:

model = Sequential()
model.add(LSTM(1, input_shape=(timesteps, data_dim), return_sequences=True))

Many-to-many when the number of steps differs from the input/output length: this is freakishly hard in Keras. There are no easy code snippets for it.
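
For reference, a minimal runnable version of the four basic snippets above might look like this; the values of timesteps, data_dim, output_size and number_of_times are arbitrary examples, not anything prescribed:

from keras.models import Sequential
from keras.layers import Dense, LSTM, RepeatVector

timesteps, data_dim = 10, 16           # example sequence length and feature size
output_size, number_of_times = 8, 5    # example output size and output length

# One-to-one: a plain Dense layer, no sequence dimension involved
one_to_one = Sequential()
one_to_one.add(Dense(output_size, input_shape=(data_dim,)))

# One-to-many: repeat the single input vector, then unroll an LSTM over it
one_to_many = Sequential()
one_to_many.add(RepeatVector(number_of_times, input_shape=(data_dim,)))
one_to_many.add(LSTM(output_size, return_sequences=True))

# Many-to-one: the LSTM returns only its last output
many_to_one = Sequential()
many_to_one.add(LSTM(output_size, input_shape=(timesteps, data_dim)))

# Many-to-many (equal lengths): return the output at every timestep
many_to_many = Sequential()
many_to_many.add(LSTM(output_size, input_shape=(timesteps, data_dim), return_sequences=True))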

EDIT: Regarding point 5

In one of my recent applications, we implemented something which might be similar to many-to-many from the 4th image. In case you want to have a network with the following architecture (when an input is longer than the output):

                                        O O O
                                        | | |
                                  O O O O O O
                                  | | | | | | 
                                  O O O O O O

You could achieve this in the following manner:

model = Sequential()
model.add(LSTM(1, input_shape=(timesteps, data_dim), return_sequences=True))
model.add(Lambda(lambda x: x[:, -N:, :])) #Select last N from output

Where N is the number of last steps you want to keep (in the picture, N = 3).
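
As a quick sanity check, with assumed example values timesteps = 6, data_dim = 16 and N = 3, the model above would produce an output covering only the last 3 steps:

from keras.models import Sequential
from keras.layers import LSTM, Lambda

timesteps, data_dim, N = 6, 16, 3   # assumed example values

model = Sequential()
model.add(LSTM(1, input_shape=(timesteps, data_dim), return_sequences=True))
model.add(Lambda(lambda x: x[:, -N:, :]))  # keep only the last N timesteps
print(model.output_shape)  # (None, 3, 1)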

From this point getting to:

                                        O O O
                                        | | |
                                  O O O O O O
                                  | | | 
                                  O O O 

is as simple as artificially padding the sequence of length N, e.g. with 0 vectors, in order to adjust it to an appropriate size.
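
A sketch of that padding step might look as follows; the values of timesteps, data_dim, N and batch_size are assumptions for illustration only:

import numpy as np

timesteps, data_dim, N = 6, 16, 3   # assumed example values
batch_size = 4                      # assumed example value

short_inputs = np.random.random((batch_size, N, data_dim))
padding = np.zeros((batch_size, timesteps - N, data_dim))

# Append zero vectors so every sample reaches the full length of `timesteps`
padded_inputs = np.concatenate([short_inputs, padding], axis=1)
print(padded_inputs.shape)  # (4, 6, 16)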


One clarification: for example, for many-to-one, you use LSTM(1, input_shape=(timesteps, data_dim)). I thought the 1 stands for the number of LSTM cells/hidden nodes, but apparently not. How would you code a many-to-one with, let's say, 512 nodes then? (Because I read something similar, I thought it would be done with model.add(LSTM(512, input_shape=...)) followed by model.add(Dense(1)); what is that used for then?)
In this case, your code (after correcting a typo) should be ok.
Why do we use RepeatVector, and not a vector with the first entry equal to the input and all the other entries equal to 0? (According to the picture above, there is no input at all at the later steps, and certainly not always the same input, which is what RepeatVector would produce, as I understand it.)
If you think carefully about this picture, it's only a conceptual presentation of the idea of one-to-many. All of these hidden units must accept something as an input. So they might accept an input with the first step equal to x and the others equal to 0, but, on the other hand, they might accept the same x repeated many times just as well. A different approach is to chain models, which is hard in Keras. The option I provided is the easiest case of a one-to-many architecture in Keras.
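
To make the two conventions concrete, here is a small sketch of the two kinds of input being discussed, x only at the first step versus x repeated at every step (which is what RepeatVector produces); data_dim and number_of_times are example values:

import numpy as np

data_dim, number_of_times = 16, 5    # assumed example values
x = np.random.random((1, data_dim))  # a single input vector

# Option 1: x at the first step, zero vectors at all later steps
first_step_only = np.zeros((1, number_of_times, data_dim))
first_step_only[:, 0, :] = x

# Option 2: the same x repeated at every step (what RepeatVector does)
repeated = np.repeat(x[:, np.newaxis, :], number_of_times, axis=1)
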
Nice! I am thinking about using an LSTM N-to-N in a GAN architecture. I will have an LSTM-based generator. I will give this generator (as used for the "latent variable" in GANs) the first half of the time series, and this generator will produce the second half of the time series. Then I will combine the two halves (real and generated) to produce the "fake" input for the GAN. Do you think using point 4 of your solution will work? Or, in other words, is this (solution 4) the right way to do it?
gustavz

Great Answer by @Marcin Możejko

I would add the following to No. 5 (many-to-many with different input/output lengths):

A) as Vanilla LSTM

model = Sequential()
model.add(LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))
model.add(Dense(N_OUTPUTS))

B) as Encoder-Decoder LSTM

model = Sequential()
model.add(LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))
model.add(RepeatVector(N_OUTPUTS))
model.add(LSTM(N_BLOCKS, return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear'))

Could you please explain the details of the B) Encoder-Decoder LSTM architecture? I'm having issues understanding the roles of "RepeatVector" / "TimeDistributed" steps.
Could you please help with how to correctly feed multidimensional data into the many-to-many or encoder-decoder model? I'm mostly struggling with the shape. Let's say we have a total data set stored in an array with shape (45000, 100, 6) = (Nsample, Ntimesteps, Nfeatures), i.e. we have 45000 samples with 100 time steps and 6 features.
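
One possible sketch of feeding data with that shape into the encoder-decoder model from B) above; note that N_BLOCKS, N_OUTPUTS and the random target array y are assumptions for illustration only, and the comments indicate the role of each layer:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense, Activation

N_SAMPLES, N_INPUTS, N_FEATURES = 45000, 100, 6
N_BLOCKS, N_OUTPUTS = 128, 10                     # assumed example values

X = np.random.random((N_SAMPLES, N_INPUTS, N_FEATURES))
y = np.random.random((N_SAMPLES, N_OUTPUTS, 1))   # one target value per output step

model = Sequential()
# Encoder: compress the whole input sequence into one vector of size N_BLOCKS
model.add(LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))
# RepeatVector: copy that vector once per desired output step
model.add(RepeatVector(N_OUTPUTS))
# Decoder: unroll an LSTM over the repeated vector
model.add(LSTM(N_BLOCKS, return_sequences=True))
# TimeDistributed(Dense): apply the same Dense layer to every output step
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear'))

model.compile(loss='mse', optimizer='adam')
model.fit(X, y, epochs=1, batch_size=32)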
