Skip to content

Latest commit

 

History

History
381 lines (297 loc) · 19.3 KB

2018-09-07-ARTS.md

File metadata and controls

381 lines (297 loc) · 19.3 KB

Week 7 ARTS


[A] - LC: 398. Random Pick Index


package leetcode;

import java.util.*;

/**
 * 398. Random Pick Index
 *
 * Given an array of integers with possible duplicates, randomly output the index of a
 * given target number. You can assume that the given target number must exist in the array.
 *
 * Note:
 * The array size can be very large. Solution that uses too much extra space will not pass the judge.
 *
 * Example:
 *
 * int[] nums = new int[] {1,2,3,3,3};
 * Solution solution = new Solution(nums);
 *
 * // pick(3) should return either index 2, 3, or 4 randomly. Each index should have equal
 * probability of returning.
 * solution.pick(3);
 *
 * // pick(1) should return 0. Since in the array only nums[0] is equal to 1.
 * solution.pick(1);
 */

public class RandomPickIndex398 {
  private Map<Integer, List<Integer>> map;

  /**
   * Solution here is to use HashMap record all the indexes,
   * then get random from target number list
   */
  public RandomPickIndex398(int[] nums) {
    map = new HashMap<>();
    for (int i = 0; i < nums.length; i++) {
      List<Integer> curr = map.containsKey(nums[i]) ? map.get(nums[i]) : new ArrayList<>();
      curr.add(i);
      map.put(nums[i], curr);
    }
  }

  public int pick(int target) {
    Random random = new Random();
    List<Integer> idxs = map.get(target);
    return idxs.get(random.nextInt(idxs.size()));
  }
  public static void main(String[] args) {
    RandomPickIndex398 test = new RandomPickIndex398(new int[]{1, 2, 3, 4, 3, 3, 5, 6, 5, 4,3});
    System.out.println(test.pick(3));
    System.out.println(test.pick(3));
    System.out.println(test.pick(4));
    System.out.println(test.pick(5));
    System.out.println(test.pick(4));
  }
}
package leetcode;

/**
 * 395. Longest Substring with At Least K Repeating Characters
 *
 * Find the length of the longest substring T of a given string (consists of lowercase
 * letters only) such that every character in T appears no less than k times.
 *
 * Example 1:
 *
 * Input:
 * s = "aaabb", k = 3
 *
 * Output:
 * 3
 *
 * The longest substring is "aaa", as 'a' is repeated 3 times.
 * Example 2:
 *
 * Input:
 * s = "ababbc", k = 2
 *
 * Output:
 * 5
 *
 * The longest substring is "ababb", as 'a' is repeated 2 times and 'b' is repeated 3 times.
 */


public class LongestSubstringwithAtLeastKRepeatingCharacters395 {
  /**
   * solution here is hinted by LC discussion Nakanu (using divide and conquer (recursive)
   * First find out the frequency less than k.
   * From that point, divide array into left and right two parts.
   * Calculate the longest substring, then compare left and right, return longest one.
   * Continue divide until it didn't find the frequency less than k or the frequency of
   * the substring characters are all less than k.
   */
  public int longestSubstring(String s, int k) {
    if (k == 1) return s.length();
    char[] str = s.toCharArray();
    return helper(str,0,s.length(),k);
  }
  private int helper(char[] str, int start, int end,  int k){
    //substring length shorter than k.
    if (end - start < k) return 0;
    int[] count = new int [26];
    // record each element frequency
    for (int i = start; i < end; i++) {
      count[str[i] - 'a']++;
    }
    for (int i = 0; i < 26; i++) {
      //count[i]=0 => i+'a' does not exist in the string, skip it.
      if (count[i] < k && count[i] > 0) {
        for (int j = start; j < end; j++) {
          // found frequency less than k and the position of the element, divide from this position
          if (str[j] - 'a' == i) {
            int left = helper(str, start, j, k);
            int right = helper(str, j + 1, end, k);
            // always return the longest one
            return Math.max(left, right);
          }
        }
      }
    }
    return end - start;
  }

  public static void main(String[] args) {
    LongestSubstringwithAtLeastKRepeatingCharacters395 test =
        new LongestSubstringwithAtLeastKRepeatingCharacters395();
    test.longestSubstring("ababbc", 2);
    test.longestSubstring("aaaaaaaa", 2);
    test.longestSubstring("aabdbccabbc", 2);
    test.longestSubstring("ababbcb", 3);
    test.longestSubstring("", 2);
    // this case, we can add the case to handle, return the original string length.
    test.longestSubstring("ababbc", 1);
  }
}

[R] - Design a Shortening URL service (TinyURL)


understand why we need shortening URL

  1. Gather requirements
  • functional requirements
  1. Need an algorithm to generate a short URL for giving URL
  2. Service need to redicrect to the orinial URL when giving a short URL
  3. User can specify an expiration of the shorten URL when creating it
  4. Users can customize the short URL when creating it
  5. Manual remove URL
  6. UI vs APIs
  • Non functional requirements
  1. This system must be highly available, meaning we cannot torelate the wrong answer, we must redirect user to original url or return error when users give a short url.
  2. redirect to the original url should happen in real-time with minimum latency
  3. shortened urls should not be guessable or predictable, meaning we need an algorithm to generate a unique short urls.

Estimate the capacity (using data to support your calculation) QPS estimation

daily users: 100M users per day

To be continued...

References:

[1] educative design URL

[2] System Design Tiny URL

[T] - Yoga Teacher Training Experience Sharing (First week)


Accidently I signed up Yoga Teacher Traning Certificate class. Completed first week training, want to share some insights what I got from this first 10 hours training.

It is a totally different experiences than you practice yoga with teacher (offline or online). How it is different and why it is so different?

I thought we will have a lot of practice during the class, but on the contrary, the first week is all about theory. talking about Yoga:

  1. The purpose of attending this class
  2. Insights of Yoga
  3. Yoga Poses/gestures
  4. Benefits of different Yoga poses
  5. Analyze Yoga poses by tearing pieces (how?)
  6. In order to teach, you need to observe students (body) and the way how other teachers teach
  7. Communicate (engaging with other people)
  8. Deep understand your body, meditation (enlightment)
  9. Prana (up) vs Apana (down)

Giving an example:

When you do Adho mukha svansana (downard facing dog), a reset pose. When being a student, you just follow teacher's instruction, and do it(teacher will come to adjust your pose).

But when you want to teach, you will need to create your own script, then you need to observe the wrong and right of student's pose. Different people has different body flexibility, (maybe injuries), you need to instruct them with different instructions, modify the instruction and refine the pose for different individual.

I thought it is easy, just copy from a teacher (don't do this, if you do, please remember credit the teacher you copy). Even Yoga is totally open source, better to create your own.

Understanding the lines of energy, what does this mean?

E,g When you do Downward facing dog, your line of energy is streching from 2 perspectives.

  1. Horizontal line energy between your hand and you feet.
  2. Verical line energy on your hips towarding to sky.

And your spine should be straight, flat not other pose.

All postures relate to your breath, there are two types:

  • Energetic Yoga - more like Yin Yoga
  • Physical Yoga - more like yang Yoga

How I feel? As I first mentioned, I didn't know what the earth hack I signed up. But I still attended. Because I have full time job (which now is super busy), I have a full time master (now need to prepare for graduateion paper), and I am preparing for interviews (really want to get into companies like Google). So Why am I here, spending the whole weekend~~ and it will last for 4 months... (cry~~)

I'd like to try new things, and I liked Yoga, but when I was there, I felt totally different world (the world I didn't like much recently). which I need to engage a lot with other people, the world that I cannot be alone any more...

The first week, I saw other members are super excited. So who knows, maybe it will be benefit me on be more proactive. I decided that I will give it a try, and wait to see the different me in the near furture.

If you are into Yoga, if you want to be a Yoga teacher, books recommand:

[1] TeachingYoga Essential Foundations and Techniques Mark Stephens


alt text

Spanner structure:

  • universe
    • universemaster - displays status information about all the zones for inter- active debugging.
    • placement driver - handles auto- mated movement of data across zones on the timescale of minutes.
    • zones (can be roughly treated as a BigTable deployment)
      • One zonemaster - assign data to spanservers
      • 100-1000 spanservers - serve data to clients

spanserver:

alt text Tablet:

similar to BigTable abstraction: (key:string, timestamp:int64) -> string

Unlike Bigtable, Spanner assigns timestamps to data

Paxos:

  • Support replication. It implements a consistently replicated bag of mappings.
  • Writes must initiate the Paxos protocol at the leader; reads access state directly from the underlying tablet at any replica that is sufficiently up-to-date.
  • The set of replicas is collectively a Paxos group.
  • Inside paxos leader each spanserver also implements transaction manager to support distributed transactions
  • The transaction manager is used to imple- ment a participant leader; ******the other replicas in the group will be referred to as participant slaves. If a transaction involves only one Paxos group (as is the case for most transactions), it can bypass the transaction manager, since the lock table and Paxos together provide transactionality. If a transaction involves more than one Paxos group, those groups’ leaders coordinate to perform two-phase commit. the participant leader of that group will be referred to as the coordinator leader, and the slaves of that group as coordinator slaves.

Lock Table:

  • Implement concurrency control

  • The lock table contains the state for two-phase lock- ing: it maps ranges of keys to lock states.

  • Operations that require synchronization, such as transactional reads, acquire locks in the lock table; other operations bypass the lock table.

    Directories and Placement:

    • Directory: a set of contiguous keys that share a common prefix alt text

    • Difference between BigTable tablet and Spanner tablet: BigTable tablet is a lexicographically contiguous partition of the row space Spanner tablet is a container that may encap- sulate multiple partitions of the row space which enables it to colocate multiple directories that are frequently accessed together

    • Movedir is the background task used to move direc- tories between Paxos groups

    • It can also be used to add or remove replicas to Paxos Group Implemented to run in backend and transactionally

    Data Model:

    Spanner exposes the following set of data features to applications: a data model based on schematized semi-relational tables, a query language, and general- purpose transactions.

    • Semi-relational schema tables:

    • every table is required to have an ordered set of one or more primary-key columns. Client applications declare the hierarchies in database schemas via the INTERLEAVE IN declarations. ON DELETE CASCADE says that deleting a row in the directory table deletes any associated child rows.

      CREATE TABLE Users { uid INT64 NOT NULL, email STRING } PRIMARY KEY (uid), DIRECTORY;

      CREATE TABLE Albums { uid INT64 NOT NULL, aid INT64 NOT NULL, name STRING } PRIMARY KEY (uid, aid), INTERLEAVE IN PARENT Users ON DELETE CASCADE; alt text

    table structure

    True Time alt text

    TrueTime API

    • The table above lists the methods of TrueTime API. The underlying time references of TrueTime are GPS and atomic clocks which give a guarantee time interval of TT.now()

    Concurrency Control:

    1. Timestamp Management alt text
    • Read-only transaction:(must predeclared as no writes)

      • Reads in a read-only transaction execute at a system-chosen timestamp without locking, so that incoming writes are not blocked.
    • Snapshot read: A snapshot read is a read in the past that executes without locking.

    • Transactional Read-Write: two-phase locking

      • Assigned timestamps when all locks required and before any locks released

      • Depends on following invariant:

        • Spanner only assign monotonically increasing order of timestamps
      • Spanner also enforces following external consistency invariant:

        • Start(T2) > Strat(T1) => Commit(T2) > Commit(T1)

          • Start: coordinator assigns Si(commit timestamp) to Ti, Si >= TT.now().latest
          • Commit Wait: The coordinator leader ensures that clients cannot see any data committed by Ti until TT.after(si) is true. Commit wait ensures that si is less than the absolute commit time of Ti, or si < t(commit).

          alt text

      • Serving Reads at a Timestamp:

        • each replica has t(safe) which satisfy read time stamp t ≤ t(safe)

          • t(safe) = min(t(Paxos_safe), t(TM_safe) ) — t(Paxos_safe) is the safe time of each paxos group which is the last write commited, t(TM_safe) is Transaction Manager safe time: alt text

          s means every participant leader (for a group g) for a transaction Ti assigns a prepare timestamp

      • Assigning Timestamps to RO Transactions:

        A read-only transaction executes in two phases: assign a timestamp s(read), and then execute the transaction’s reads as snapshot reads at s(read).

    • Paxos Leader lease:

      • Leadership use lease (10s by default)
        • Potential leader sends requests for timed lease votes
        • Upon received quorum of lease notes, the leader knows it has a lease
        • Replica extends lease after successful write, leader request extension near expiration
    1. Details
    • Read-Write Transactions: writes that occur in a transaction are buffered at the client until commit.As a result, reads in a transaction do not see the effects of the transaction’s writes.

      1. The client issues reads to the leader replica of the appropriate group, which acquires read locks and then reads the most recent data.

      2. When a client has completed all reads and buffered all writes, it begins two-phase commit.

      3. The client chooses a coordinator group and sends a commit message to each participant’s leader with the identity of the coordinator and any buffered writes. (Having the client drive two-phase commit avoids send- ing data twice across wide-area links)

      4. A non-coordinator-participant leader first acquires write locks. It then chooses a prepare timestamp that must be larger than any timestamps it has assigned to previous transactions (to preserve monotonicity), and logs a prepare record through Paxos. Each participant then notifies the coordinator of its prepare timestamp.

      5. The coordinator leader also first acquires write locks, but skips the prepare phase. It chooses a timestamp for the entire transaction after hearing from all other participant leaders. TT(commit) > max(TT(prepare), TT.now(), TT(previous transaction))

    • Read-Only Transaction: Assigning a timestamp requires a negotiation phase be- tween all of the Paxos groups that are involved in the reads. As a result, Spanner requires a scope expression for every read-only transaction, which is an expression that summarizes the keys that will be read by the entire transaction. Spanner automatically infers the scope for standalone queries.

      • If the scope’s values are served by a single Paxos group, then the client issues the read-only transaction to that group’s leader.
      • If the scope’s values are served by multiple Paxos groups, there are several options.
        • The most complicated option is to do a round of communication with all of the groups’s leaders to negotiate sread based on LastTS().(LastTS() to be the timestamp of the last committed write at a Paxos group)
        • The client avoids a negotiation round, and just has its reads execute at sread = TT.now().latest (which may wait for safe time to advance). (Current Spanner solution)
    • Schema-Change Transactions: A Spanner schema-change transaction is a generally non-blocking variant of a standard transaction.

      1. It is explicitly assigned a timestamp in the future, which is registered in the prepare phase. As a result, schema changes across thousands of servers can complete with minimal disruption to other concurrent activity.
      2. Second, reads and writes, which implicitly depend on the schema, synchronize with any registered schema-change timestamp at time t: they may proceed if their times- tamps precede t, but they must block behind the schema- change transaction if their timestamps are after t.
    • Refinements:

      • t(TM_safe) as defined above has a weakness, in that a single safe prepared transaction prevents t **from advancing. As a result, no reads can occur at later timestamps, even if the reads do not conflict with the transaction. Such false conflicts can be removed by augmenting t(TM_safe) with a fine-grained mapping from key ranges to prepared- transaction timestamps. This information can be stored in the lock table, which already maps key ranges to lock metadata. When a read arrives, it only needs to be checked against the fine-grained safe time for key ranges with which the read conflicts.

      • use same way to augment LastTS()

      • t(Paxos_safe) won't advance without Paxos write. Solution: Each Paxos leader advances t(Paxos_safe) by keeping safe a threshold above which future writes’ timestamps will occur: it maintains a mapping MinNextTS(n) from Paxos sequence number n to the minimum timestamp that may be assigned to Paxos sequence number n + 1. A replica can advance t(Paxos_safe) to MinNextTS(n) 1 when it has applied through n.

        A single leader can enforce its MinNextTS() promises easily. Because the timestamps promised by MinNextTS() lie within a leader’s lease, the disjoint- ness invariant enforces MinNextTS() promises across leaders. If a leader wishes to advance MinNextTS() beyond the end of its leader lease, it must first extend its lease. Note that smax is always advanced to the highest value in MinNextTS() to preserve disjointness.

        A leader by default advances MinNextTS() values ev- ery 8 seconds. Thus, in the absence of prepared trans- actions, healthy slaves in an idle Paxos group can serve reads at timestamps greater than 8 seconds old in the worst case. A leader may also advance MinNextTS() values on demand from slaves.