
Question regarding discrete model prediction layer activation function and model loss function #121

Description

@ericparakal

In the discrete model, specifically the model definition in kdd99_model.py, why does the prediction layer use a sigmoid activation rather than softmax, given that KDD99 is a multi-class classification problem?

pred = tf.keras.layers.Dense(n_labels, activation='sigmoid')(net)
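
To make the distinction concrete, here is a minimal sketch in plain Python (no TensorFlow) comparing the two activations on the same scores. The `logits` values are made up for illustration and do not come from kdd99_model.py. Sigmoid squashes each score independently, so the outputs need not sum to 1; softmax normalizes across classes, so they do.

```python
import math

def sigmoid(logits):
    # Element-wise: each class score is squashed independently into (0, 1).
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def softmax(logits):
    # Normalized across classes: outputs are positive and sum to 1.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical pre-activation scores for a 3-class problem.
logits = [2.0, 1.0, -1.0]

sig = sigmoid(logits)
soft = softmax(logits)

print("sigmoid:", sig, "sum =", sum(sig))
print("softmax:", soft, "sum =", sum(soft))
```

Both functions are monotonic in each score, so the index of the largest output is the same either way. The difference is that only the softmax outputs form a probability distribution over mutually exclusive classes.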

Also, why is from_logits set to True in the SparseCategoricalCrossentropy loss when the prediction layer already applies a sigmoid activation?

model_full.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                   metrics=['accuracy'],
                   optimizer='adam')
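
The concern can be illustrated with a small sketch, again in plain Python rather than TensorFlow. It assumes that from_logits=True makes the loss apply a softmax to whatever the model outputs, which is how the Keras documentation describes the flag. The helper `sparse_ce_from_logits` and the `raw` scores are hypothetical and only mimic that behavior. Feeding sigmoid outputs into such a loss confines the "logits" to (0, 1), so the loss cannot approach zero even for a confident, correct prediction.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sparse_ce_from_logits(logits, label):
    # Cross-entropy with an internal softmax: -log(softmax(logits)[label]),
    # computed as log-sum-exp minus the score of the true class.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# Hypothetical scores for a 3-class example that is confidently class 0.
raw = [8.0, -8.0, -8.0]
label = 0

# Case 1: linear output layer, raw scores go straight into the loss.
loss_raw = sparse_ce_from_logits(raw, label)

# Case 2: sigmoid output layer, then the loss applies softmax on top.
loss_sig = sparse_ce_from_logits([sigmoid(z) for z in raw], label)

# With inputs confined to (0, 1), the best case is true class -> 1 and the
# others -> 0, which gives a lower bound of log(1 + (n - 1) / e).
floor = math.log(1.0 + (len(raw) - 1) / math.e)

print("loss on raw logits:     ", loss_raw)
print("loss on sigmoid outputs:", loss_sig)
print("lower bound with sigmoid:", floor)
```

Under these assumptions, the more conventional pairings would be a linear output layer with from_logits=True, or a softmax output layer with from_logits=False.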
