DateTime improve constructor params #591

Open · 2 tasks
breznak opened this issue Jul 25, 2019 · 3 comments
Labels: encoder, enhancement, newbie


breznak commented Jul 25, 2019

The DateTime encoder in py/htm/encoders/date.py is a composite of several sub-encoders (i.e. timeOfDay, dayOfWeek, weekend, ...).

  • Currently, each field takes a parameter bits, which specifies how many bits that field activates (this corresponds to the "importance" of the field relative to the others).
  • The total size of the encoder is "unknown" up front (it depends on all of those settings).

For modelling purposes it would be nice to have:

  • a fixed size for the DateTime encoder (param size=, like in RDSE)
  • instead of bits, the user would specify the importance of each field (in [0.0, 1.0]):
    • the importances are normalized so that they sum to 1.0
    • each sub-encoder takes up N bits based on its normalized importance
    • example: size=100, weekend=0.1, timeOfDay=1.0, dayOfWeek=1.0
      • time & day are important, and equally so
      • weekend is 10x less important
      • approx: weekend: 6 bits, day: 47 bits, time: 47 bits (a small sketch of this allocation follows below)

It would be much easier to work with such an encoder.
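A minimal sketch of the proposed allocation, assuming a hypothetical helper name (allocate_bits) and a simple rounding scheme (any rounding drift is absorbed by the largest field so the widths sum to exactly size):

def allocate_bits(size, importances):
    # Hypothetical helper: split `size` bits among fields in proportion
    # to their (not necessarily normalized) importances.
    total = sum(importances.values())
    bits = {name: int(round(size * imp / total))
            for name, imp in importances.items()}
    # absorb rounding drift so the widths sum to exactly `size`
    drift = size - sum(bits.values())
    largest = max(bits, key=bits.get)
    bits[largest] += drift
    return bits

print(allocate_bits(100, {"weekend": 0.1, "timeOfDay": 1.0, "dayOfWeek": 1.0}))
# {'weekend': 5, 'timeOfDay': 47, 'dayOfWeek': 48} -- close to the ~6/47/47 above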

  • Q: make size a mandatory param in base Encoder?
breznak added the enhancement, newbie, and encoder labels on Jul 25, 2019
ctrl-z-9000-times (Collaborator) commented:

This is a good idea! I think this should be implemented as a shim function which transforms the importance parameters into a call to the underlying Python DateTime encoder.

> Q: make size a mandatory param in base Encoder?

The size is always defined for an encoder once it has been set up and constructed.


breznak commented Jul 26, 2019

Glad you like it!

> should be implemented as a shim function which transforms the importance parameters into a call

Can you specify what you mean by this?

Thinking about the problem, I'd like to make this importance available for each input/sensor.
I came up with the following approaches:

  1. DateTime encoder adds an optional arg DateTime(..., size=0)
  • if specified, the bits for sub-encoders are treated as importances and normalized so that the total size of the encoder is as requested
  • pro: allows keeping the existing API
  • con: only for DateTime
  2. The "importance weighting" should have been done by MultiEncoder,
  • which we decided to remove and use SDR.concatenate() instead.
  • provide a new SDR.shrink_to_fit(vector<SDR> sdrs, vector<Real> importance)
    • would select only a weighted portion from sdrs[i].getSparse()
    • pro: generic, easy to implement
    • con: not an ideal approach (encodings are not SDRs), and selecting only a part may break the encoding (actually, now I think this approach would not work as is)
  3. TM, SP have multiple (weighted) inputs (SDRs)
  • realized like Connections' extraInputs
    • instead of having a single input array, have a (constant number of constant-sized) arrays/SDRs
    • each has a weight
    • Connections "gives importance" via the probability of distributing potential synapses among these inputs (a rough sketch follows below)
    • pro: generic, plausible solution
    • Conn::extraInputs could be removed / would be just another input[i]

What do you think, especially of 3)?
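To make 3) more concrete, here is a rough, standalone sketch (plain Python, not the actual Connections API) of how potential synapses could be distributed across several weighted inputs:

import random

def sample_potential_synapses(input_sizes, weights, num_potential):
    # Illustration only, not the real Connections API: give each input a share
    # of the potential synapses proportional to its weight.
    total = sum(weights)
    pool = []
    for idx, (size, w) in enumerate(zip(input_sizes, weights)):
        share = int(round(num_potential * w / total))
        pool += [(idx, bit) for bit in random.sample(range(size), min(share, size))]
    return pool  # list of (input index, bit index) pairs

# e.g. a 47-bit timeOfDay input (weight 1.0) and a 6-bit weekend input (weight 0.1)
print(sample_potential_synapses([47, 6], [1.0, 0.1], num_potential=20))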

ctrl-z-9000-times (Collaborator) commented:

> Can you specify what you mean by this?

I was thinking you could write a Python function like:

def DateEncoder2(weight1, weight2, size=100):
    # transform the weights into the bit counts the current DateEncoder expects
    total = weight1 + weight2
    numBits1 = int(round(size * weight1 / total))
    numBits2 = size - numBits1
    return DateEncoder(numBits1, numBits2)

> TM, SP have multiple (weighted) inputs (SDRs)

I think that would add a lot of complexity to the SP & TM. I'd rather just concatenate the inputs into one big input.
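For context, a rough sketch of that alternative using htm.core's SDR class (the exact randomize/concatenate signatures below are from memory and may differ slightly in the current bindings):

from htm.bindings.sdr import SDR

# two sub-encodings of different widths, e.g. timeOfDay and weekend bits
time_of_day = SDR(47); time_of_day.randomize(0.10)
weekend     = SDR(6);  weekend.randomize(0.50)

# one flat input for the SP/TM, as suggested above
combined = SDR(47 + 6)
combined.concatenate(time_of_day, weekend)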

Connections "gives importance" by probability of distributing potential synapses among these inputs.

I had a similar idea where the Connections class did this, except that instead of hard-coding all of the rules and probabilities, they were learned from the synapse data. However, I never got around to figuring out how to implement it efficiently.
