Do inauspicious days influence child birth date selection?

Part 0 | Part 1 | Part 2

We stir the data-pile a bit more this time around. We start with the question – does a certain portion of the population avoid inauspicious days through appropriately chosen C-Section slots?

I did some digging and it’s pretty hard to map out all the inauspicious times over 2012-15. In any case we don’t have time stamps, just birth dates. So the two possible things we could check for are whether certain:

  • dates (such as the 13th)
  • days (such as Tuesday)

are being materially under-represented.

Dates & Months

Dates&Months

Every date on average should see ~0.27% (1/365) of births for the year. Each number denotes the deviation from that average in basis points. For instance, Jan 1 sees 0.29 percentage points (29 basis points) more births than average, i.e. more than double what is to be expected.
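For anyone who wants to reproduce the numbers, here is a minimal sketch of the deviation calculation. The function name and the sample counts are my own illustration, not values from the DoE data, and I assume a flat 365-day year.

```python
def deviation_bps(births_on_date, total_births, days_in_year=365):
    """Deviation of one date's share of births from the uniform share, in basis points."""
    expected_share = 1 / days_in_year            # ~0.27% of births per date
    actual_share = births_on_date / total_births
    return (actual_share - expected_share) * 10_000  # 1 basis point = 0.01%


# Hypothetical example: a date that accounts for 0.56% of all births
print(round(deviation_bps(56, 10_000), 1))  # 28.6
```

A date sitting about 29 basis points above average therefore carries roughly twice its expected share of births.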

Conclusions:

a. 1st & 2nd Jan seem vastly over-represented – I first considered dropping 01 Jan, presuming this was owing to typos in the original data, but that doesn’t seem to be the case

b. There is material under-representation of the 13th relative to the 12th and the 14th – in almost every month except October, fewer children were born on the 13th than on the 12th or the 14th

c. Apr, May and Jun are under-represented – this is presumably owing to parents “blue-shirting” their kids and securing admissions in Noida and Gurgaon, unwilling to let their children “waste” a year by risking Delhi school admissions. These children would then not apply to Delhi schools this year, biasing the dataset.

Days of the Week:

DaysOfWeek

Conclusions:

Surprisingly, good old-fashioned sanity seems to prevail here – no one wants to hit an understaffed hospital on a Saturday or a Sunday if they can avoid it. Instead there is a spill-over effect into the front half of the following week and to some extent Thu & Fri.

The anticipated Tuesday drop is non-existent.


What Amit & Pooja decide to call their kids (Part 2) (Delhi Schools – 3/n)

Story so far: Part 0 | Part 1

Getting straight to it:

Girls vs Women

  • 8% of girls in the 2013-15 cohort have their name starting with AA
  • ~25% of all girls have a name starting with the letter A
  • SA continues to be the perennial favourite (5.5% in the 2013-15 cohort and 5.2% with the mums)
  • Names starting with NE and SH have dropped the most in popularity

Boys vs Men

  • 5.6% of boys in the 2013-15 cohort have their name starting with AA
  • ~20% of all boys have a name starting with the letter A
  • VI continues to remain the perennial favourite (5.3% and 5.5%) – that said Vijay & Vikram have given way to Vihaan & Vivaan
  • Names starting with RA and SU have seen a steep drop in popularity
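The percentages above are just shares of names grouped by their first one or two letters. A minimal sketch of that tally is below; the function name and the four sample names are only for illustration and are not the actual DoE data.

```python
from collections import Counter


def prefix_shares(names, length=2):
    """Share of names starting with each prefix of the given length."""
    cleaned = [name.strip().upper() for name in names if name.strip()]
    counts = Counter(name[:length] for name in cleaned)
    total = sum(counts.values())
    return {prefix: count / total for prefix, count in counts.most_common()}


# Illustrative sample only
sample = ["Aarav", "Aadhya", "Vihaan", "Vivaan"]
print(prefix_shares(sample))            # {'AA': 0.5, 'VI': 0.5}
print(prefix_shares(sample, length=1))  # {'A': 0.5, 'V': 0.5}
```

Running the same function on the kids' and the parents' names and subtracting the two results gives the kind of comparison shown in the "vs" charts.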
Girlsl1l2
Girls’ Names – First & Second Letters
Mumsl1l2
Mums’ Names – First & Second Letters
Girls vs Mums
Girls vs Mums (Blue = More common in Girls, Red = More common in mums)
Boysl1l2
Boys’ Names – First & Second Letter
Dadsl1l2
Dads’ Names – First & Second Letters
BoysvsDadsl1l2
Boys vs Dads (Blue = More common in boys, Red = More common in dads)

 

This tweet from @feelwelcome feels appropriate:

 

More when I find the time….

What Amit and Pooja decide to call their kids (Delhi Schools – 2/n)

Story so far: Part 1

A recent article about Indo Anglians was doing the rounds on my FB feed. This piece, in turn, references an older piece about First World Yoga Names. I needed little else to be inspired to poke around with the DoE dataset which lists roughly 78,000 unique individuals of the 2013-2015 year of birth cohort (I could tell you how many boys and girls but that’s a topic for a whole other post in itself) and roughly 2x that number of parents, all residents of NCR.

This allows one to explore the drift in popularity of first names over a single generation.

I start by presenting the most popular first names in the parents cohort in descending order of frequency:

Men:

A little more than 2.5% of the nearly 80k adult men in the dataset were named Amit with Deepak coming in at a distant second.

Dads

Women:

Neha leads, but doesn’t dominate quite the way Amit does with the men, followed by Pooja, Priyanka, Jyoti and Preeti.

Moms

Right, so what do Amit and Pooja name their kids then? Do the girls get named Kaira, Shyra and Shanaya as the article purports? And do the boys end up being Adi, Sid and Kabir?

Girls:

The winner by a mile here is some variation of Aaradhya/Aradhya/Aradya followed by Aadhya/Aadya. Taimur and Misha might be paparazzi favourites today but before them came Aaradhya Bachchan to inspire thousands of young parents. IMO the rest of the list is not as FWGN as you’d expect but I leave it to your judgement.

girls

Boys:

Without a doubt Aarav is the Amit of this generation. Also note the dominance of “VI” names which I hope to explore further in a subsequent post.

boys

That’s all for today folks. More when I find the time…

 

Delhi Schools – 1/n

It’s admission season here in Delhi and kiddo is in the fray. The DOE does a pretty neat job of putting up registered applicants school wise here which is great if you’re querying by school but sucks if you’re querying by student name. I wrote a little scraper + data reorganizer for my personal use last weekend.

I’ve tried wrapping it into a Google site here for anyone who is looking to get a summary of points across schools for their ward:

https://sites.google.com/view/delhi-school-search/home

I am not the front end guru so pardon the poor quality of the site. Please feel free to share with anyone who is looking to get a points summary for their kid.

Also to follow when I get the time: some interesting data visualizations.

 

Using Chinese Remainder Theorem IRL

In 1999 I had the privilege of attending the KRMO (Karnataka Regional Maths Olympiad) camp at IISc. While it didn’t do much to improve my math capabilities, it taught me a lot about how to preserve self-esteem when in the presence of materially smarter, sharper & more capable folks.

One of the quaint bits of math that I did learn was the Chinese Remainder Theorem and true to its name it has remained in the recesses of my brain since unused as yet another thing I learnt in school which I would never use.

Until today.

Earlier today I uploaded a large number of files into Google Drive in batches of 300 or so, a process that took close to an hour. When the upload was done Google helpfully let me know that some uploads in my last batch had failed.

Further Google Drive has some “helpful features” in that:

a. It creates duplicates of files when you re-upload without checking for conflicts

b. It doesn’t tell you the number of files that exist in a folder

I needed a quick way of figuring out how many files I was missing. Since only the last batch failed it was between 0 and 300.

Exasperated I listed contents and started counting the number of files manually. Not surprisingly I quickly lost count. I switched to grid mode and scrolled right down when I saw this:

pic1

I stretched out the window and saw this:

pic2.png

I had an epiphany – this was a job for CRT.

I quickly wrote down the number of files in the last row for different column sizes like this:

Number of Columns   Number of Files displayed in last row
7                   6
6                   2
5                   4
4                   2
3                   2


The number of files in the last row of each grid represents the remainder when N, the number of files in the folder, is divided by the number of columns being displayed. The question for CRT is to compute the total number of files N in the folder based on these remainders. This looks almost impossible to someone who hasn’t seen CRT at work.

After a few minutes of googling I found the method for reconstructing N here:

http://homepages.math.uic.edu/~leon/mcs425-s08/handouts/chinese_remainder.pdf

The math works out like so:

  1. Drop the 6 column case because we need co-prime divisors
  2. Compute the product of the co-prime divisors = z
  3. Compute z1..z4 as z/m1…z/m4
  4. Compute the modular multiplicative inverses y1..y4, where each yi is the inverse of zi modulo mi (this takes a slight amount of effort)
  5. Compute SumProduct of Remainders, Divisor Products and MIMs
  6. Compute remainder R as the SumProduct modulo z

Table

CRT now tells us that N must be of the form k*z + R where k is an integer ≥ 0 (k=0 gives 314 itself)

k=1 =>  420+314 =  734
k=2 =>  840+314 = 1154
k=3 => 1260+314 = 1574
k=4 => 1680+314 = 1994

I was looking to upload 1579 files and I had transferred at least 1200 until the last batch so 1574 was the number I was after. I was missing just 5 files.
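The last step of narrowing the candidates down can also be checked mechanically. A small sketch, taking z = 420 and R = 314 from the CRT computation and the bounds 1200 and 1579 from my upload counts (the function name is just for illustration):

```python
def candidates(z, r, low, high):
    """All N = k*z + r with k >= 0 that fall inside [low, high]."""
    return [n for n in range(r, high + 1, z) if n >= low]


found = candidates(420, 314, 1200, 1579)
print(found)            # [1574]
print(1579 - found[0])  # 5 files missing
```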

Of course, this hardly solves the problem – I still need to find those 5 and re-upload them. 😐

Poetry and Progeny

In terms of firsts, the earliest memory I have of a book that I self-read is this combo of “Whiskers for a Cat and Bilderoo is coming”. When I extend the same question to poetry there is nothing that comes to mind.

Until today.

Thanks to kiddo I rediscovered Eleanor Farjeon and Cats which we read and re-read until one of us had had enough. The poem evokes strong, and very likely false, childhood memories of the stale wooden scent of classroom 2B, my grandpa’s deep voice and my desire to own a cat as a child.

Still trundling through this reverie I stumbled upon this meta gem:

I’ll tell you, shall I, something I remember?
Something that still means a great deal to me.
It was long ago.

A dusty road in summer I remember,
A mountain, and an old house, and a tree
That stood, you know,

Behind the house. An old woman I remember
In a red shawl with a grey cat on her knee
Humming under a tree.

She seemed the oldest thing I can remember.
But then perhaps I was not more than three.
It was long ago.

I dragged on the dusty road, and I remember
How the old woman looked over the fence at me
And seemed to know

How it felt to be three, and called out, I remember
‘Do you like bilberries and cream for tea?’
I went under the tree.

And while she hummed, and the cat purred, I remember
How she filled a saucer with berries and cream for me
So long ago.

Such berries and such cream as I remember
I never had seen before, and never see
Today, you know.

And that is almost all I can remember,
The house, the mountain, the gray cat on her knee,
Her red shawl, and the tree,

And the taste of the berries, the feel of the sun I remember,
And the smell of everything that used to be
So long ago,

Till the heat on the road outside again I remember
And how the long dusty road seemed to have for me
No end, you know.

That is the farthest thing I can remember.
It won’t mean much to you. It does to me.
Then I grew up, you see. 

 

Of course we had to finish with:

Five minutes, five minutes more, please!
Let me stay five minutes more!
Can’t I just finish the castle
I’m building here on the floor?
Can’t I just finish the story
I’m reading here in my book?
Can’t I just finish this bead-chain —
It almost is finished, look!
Can’t I just finish this game, please?
When a game’s once begun
It’s a pity never to find out
Whether you’ve lost or won.
Can’t I just stay five minutes?
Well, can’t I just stay just four?
Three minutes, then? two minutes?
Can’t I stay one minute more? 

As we reluctantly wound up for the evening I couldn’t help but wonder if kiddo would one day find a lost part of themselves while attempting to similarly educate a greener, and perhaps artificial, mind.

IIMA 2006 – 10y Reunion

20161217_114216-01

An earlier version of this piece appeared in the IIMA Alumnus Feb 2017 issue.

In Robert Pirsig’s seminal work, Zen and the Art of Motorcycle Maintenance, a student of the protagonist Phaedrus who is looking to write a five hundred word essay on the United States, finds herself at a loss for words, not knowing where to begin.

Trying to describe the Batch of 2006’s 10 year reunion weekend in a short passage, I’d like to think, is a similarly challenging task. After a considerable amount of time has been spent mulling over where to begin, it is very tempting to reduce the narrative to a mundane assortment of objective facts and the lowest common denominator of shared experience. I hope what follows does more than just that.
The new campus is a fantastic piece of architecture. Its stoic grey walls, while paying homage to Louis Kahn’s vision, have a character of their own. Twin ponds of water lilies, surrounded by flocks of noisy pigeons, welcomed us to the IMDC. These exquisite flowers came to life at dusk and were in full bloom at midnight, which seemed like an appropriate metaphor for the nature and intensity of our own conversations and activities. Yet, unlike the pigeons, we chose not to congregate around these blooming flowers. For the heart longed for a joy and vivaciousness that only red can engender.

Some proponents of field theory would like to believe that we humans are devoid of an independent personality, but rather, that we can only find meaning in the context of our environment. The close to a hundred of us who arrived on campus brought with us a decade of calluses, battleworn from our careers and weighed down by the responsibilities that time and age have bestowed upon us. Fortunately, we found all manner of ways to moult and rediscover our younger selves, as we were, in simpler, and perhaps only in hindsight, happier, times. For some it was just being able to meet long-lost friends, while others found their salvation on the cricket ground. Yet others resorted to the familiar taste of Rambhai’s chai or the lunch thali at Agashiye to rekindle old memories. As night fell, stronger restoratives were employed to keep open the doors of perception, helping us maintain peak performance be it at the ramp or the poker table. Few, however, would disagree that any of these experiences would have held as much meaning outside the confines of those magic red bricks, the late night dew and chilly winds of LKP or the characteristic musty odour of CRs 3 through 6. At no time was this more apparent than when the clock struck midnight, when, irrespective of where we had been until then, we found ourselves migrating slowly in groups towards the old campus under the pretext of an after-dinner chai at CT, and staying back for hours on end to stroll through campus making sure the present generation of PGP1s were adequately focused on academics.

Going back at this point to the story of the young student struggling to write her essay: her professor, Phaedrus, suggests that she try narrowing down her focus at first to just the city, then to a street, to a building and finally to a single brick, at which point she suddenly experiences a deluge of literary and creative output which leads her to fill many pages talking about just that brick. Perhaps there is more here then, than just a trick to get over writer’s block.

20161217_115125-01.jpeg

On the final day, as we bade each other our final goodbyes, there was unanimous agreement that the reunion had turned out better than our wildest expectations. A spartan affair, bereft of holiday destinations, celebrity appearances or pro shows, managed on a meager budget by a handful of enthusiastic folks. Perhaps all that was needed to infuse the weekend with meaning, fulfillment and happiness, was the people and the red bricks. A decade into our post IIMA lives, that does leave one wondering, as to how many of us have identified similar cornerstones to anchor the lives we were returning to and to make them more meaningful.

In closing, I’d like to thank those who were instrumental in making this experience truly special – the Alumni Office and the organizing committee, Director Nanda and Prof Basant for taking the time to speak with us, our friend and batchmate, Prof Amit Karna, who we are fortunate and proud to call one of our own, Prof Handa for the lovely mementos, Poza, Anu, Rejoy, Paldy, Mansur for the memories and the entertainment and finally Tahseen and DD, without whose tireless efforts in marshaling the batch into turning up in significant numbers and coordinating and managing payments and expenses this reunion would not have been such a resounding success.

What do you know about Warangal?

Hastily written, please excuse typos.

First of all I would like to thank everyone who took the time to take this poll on Facebook. You guys are fantastic people for having taken the time to do this frivolous poll and I hope it was fun.

Polling has been a topic of interest for me for a while, and that interest has only been heightened by how wrong polls have been all of 2016.

Recently Suhas Mathur flagged to me this excellent paper on using meta-knowledge to improve polls. It is well worth reading, and if this is your cup of tea you should go read it (rather than the rest of this post).

Here is an example from the paper itself:

As an elementary example, consider the (false) proposition that Chicago is the capital of the state of Illinois. Respondents might form different opinions about the truth of the proposition, depending on whether they knew: (a) that Chicago is a large city, (b) that it is located in the state of Illinois, (c) that Springfield is the actual capital of Illinois, and so on. If the typical person is aware of (a) and (b) but not of (c), then the majority of those queried might vote for the incorrect answer, that the proposition is True. A democratic poll ignores the asymmetry in metaknowledge between respondents who know the right answer and those who do not. Those who know that Chicago is not the capital of Illinois can imagine that many others will be misled. A comparable insight into the opinions of others is not available to those who falsely believe the answer is Yes (22). Our scoring method in effect reweights the votes so as to reflect different levels of metaknowledge associated with each possible answer. If the method works as claimed, the true answer should emerge as the winner, regardless of how many respondents endorse it.
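To make the quoted idea concrete, here is a minimal sketch of the “surprisingly popular” rule that this line of work is known for: pick the answer that gets more votes than respondents predicted it would. This is a simplification and not the full scoring method from the paper, and the votes and predictions below are made up purely for illustration; they are not from my poll or from the paper.

```python
def surprisingly_popular(votes, predicted_true_share):
    """
    votes: list of bools, each respondent's own answer to a True/False question.
    predicted_true_share: each respondent's guess of the fraction of others voting True.
    Returns the answer whose actual share exceeds its predicted share.
    """
    actual_true = sum(votes) / len(votes)
    predicted_true = sum(predicted_true_share) / len(predicted_true_share)
    return actual_true > predicted_true


# Hypothetical poll: "Chicago is the capital of Illinois"
# 6 respondents say True and expect 90% of others to agree;
# 4 say False but expect 70% of others to (wrongly) say True.
votes = [True] * 6 + [False] * 4
predictions = [0.9] * 6 + [0.7] * 4
print(surprisingly_popular(votes, predictions))  # False
```

In this made-up example True wins the simple majority vote (60%), but it falls short of the 82% everyone expected, so False is the surprisingly popular and correct answer.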

The immediate comparable I could think of in the Indian context was the recent creation of Telangana out of Andhra Pradesh. I picked Warangal as a test case – I expected my peer group to be pretty evenly split in guessing which state Warangal was now part of and I was not disappointed. For good measure I added on 9 other cities of what I thought would be varying degrees of difficulty, each of which was answered correctly by 60% or more of participants.

Warangal, on the other hand, was correctly identified as belonging to Telangana by only 21 of my first 41 participants. Could Prelec & Seung’s method do better? Was it just guesswork or did the metaknowledge of those that answered correctly have some value?

Prelec & Seung advise the creation of a Bayesian Truth Serum (BTS) in order to identify expert subsets within each cohort. Higher BTS scores should suggest greater expertise and this should in turn result in more accurate results.

I find that the chart below provides significant validation to the concept.

The orange line tracks the average performance of a random subset of participants. For instance my first 5 participants (who were perhaps expert quizzers who got cracking without much prodding) all got the answer correct but those that followed have mixed performance taking the average down to just above 50% towards the end.

BTS Sorting allows us to capture the first 13 correct answers in order before a somewhat monotonic convergence towards the average which is commendable.

warangal

Here is the same method of analysis from Prelec & Seung:

PrelecSeung.PNG

And finally the real MVP with the highest average BTS score is Mansur Ahamed, who incidentally got all answers correct except Warangal (in all fairness he was tied with someone who chose to remain anonymous by not signing in via FB).

Remembering Michael Crichton

jp

This month marks the late Michael Crichton’s 74th birth anniversary and it got me reminiscing about Jurassic Park.

Released in 1993, the film made box office history and led to many other Crichton stories being filmed (Disclosure, Rising Sun, Congo, Sphere, Eaters of the Dead and of course Lost World). I have read a significant chunk of what Crichton has had to offer going back to Terminal Man and right up to Next. I don’t care much for his earlier work under the pseudonym John Lange, though A Case of Need (written under a different pseudonym, Jeffery Hudson) could be interesting to some.

1994

I had just donned spanking new three strings that summer which I took to be a sign of adulthood and marked the occasion by making an earnest transition from Enid Blyton to the Hardy Boys and Three Investigators and refusing to read “comics”.

After having refused to let me see Jurassic Park in the theaters my dad came home one day with a crisp new paperback. He doesn’t really read fiction and the only non-religious books on his shelf to this day are The Tao of Physics & A Brief History of Time.

The embossed black T-Rex skeleton on the cover was inviting and I picked up the book when he wasn’t looking. I was hooked from the start. I eagerly read everything that Crichton had written until then and remember being thoroughly disappointed with Disclosure which had none of the sci-fi/action plot elements of his earlier works. (Rising Sun was at least entertaining.) I quickly moved on to Asimov, Clarke and other writers who were “truer” to the sci-fi genre and haven’t looked back since.

I did faithfully leaf through every new book Crichton came up with since, but he was clearly past his prime by the early 2000s. After showing some promise with Timeline and Prey, in an almost M Night Shyamalan fashion, the quality of his work dwindled until I couldn’t force myself to get to the end of Pirate Latitudes, and after reading the sample of Micro recently I chose not to download it on my Kindle.

But then again I came here to praise Caesar, not to bury him.

1969

After dishing out a series of potboilers, Crichton had just written Andromeda Strain, his first real attempt at somewhat “hard” sci-fi, which was yet to achieve critical acclaim. He wrote a review of Vonnegut’s Slaughterhouse Five in The New Republic (which you can read here) that opens a window into his mind. He uses the pretext of a review to lament the lack of good sci-fi that is also good fiction. He blames Verne and Wells for being romantics, berates Heinlein, Zelazny and Ballard for exploiting drug culture and accuses Kurt Vonnegut of being a charlatan – ‘There is also some business about a distant planet and flying saucers, but that does not make the book science fiction, any more than flippers make a cat a penguin.’ And then, having finished this review, he went on to write EXACTLY what he thought good science fiction should be for the next 30 years, making the genre both accessible and mainstream, giving millions of 80s kids a taste of it and getting them hooked for life.

Thank you for that Michael Crichton and Happy Birthday!

And so it goes.

10 years since Scanner Darkly – CNN vs Rotoscoping

A_Scanner_Darkly_Poster

Scanner Darkly (the movie) directed by the brilliant Richard Linklater came out in July 2006. The movie has so many things to talk about including the war on drugs (and why it cannot be won), PKD’s own creativity being fuelled by his substance abuse, how scramble suits could be the way to end racism etc.

Instead, I’ve decided to pick form over substance and talk about the animation because of two stats:

  • it took 18 months to animate the movie in 2006
  • with Convolutional Neural Networks it might just take a day to do that by the end of 2016 (I exaggerate highly, but read on)

Back in 2006, I was amazed by the animation technique so I tried to read up on Interpolated Rotoscoping which was the technique used to create the movie. I didn’t get very far but here is a nice 4 min video about it and some quotes from in there:

we shot the actual film and we locked … and then there was a lengthy post-production process in this case was 18 months

the animation process which is so cumulative and so slow – hundreds of hours to do 1 minute

we thought it would take 350 man hours per minute, we were pretty off on that it took a lot longer

 

Fast forward to 2015-16, and we have:

1. CNN and this paper

2. This Torch implementation on github

3. Ostagram becoming a big thing overnight

4. Prisma

They also have plans for video, with Moiseenkov saying their processing technique can still work quickly enough for a mobile video scenario.

“Photos is only the start. We plan to add something like the Boomerang app from Instagram. Like short cycles. We plan to add them in the near future — I think in July. And some sort of very clever filters where the quality will be superb,” he adds.

So potentially by the end of 2016 that 18 months of work and ~100,000 man-hours of effort could effectively become a day or less of work for a powerful CNN – I’ll stop there and leave you to think about that.