New Chinese Science Fiction

This time I digress from my usual themes to talk about Chinese science fiction. But you will see that it is not a great deviation: some of the topics are familiar. As William Gibson famously said: “The future is already here – it’s just not very evenly distributed.”

I have always been a science fiction fan, and I have noticed in recent years that the genre is booming in China (maybe as a way to talk more freely about current affairs by disguising them as speculative fiction?).

Here are three books that I enjoyed lately. The links are to the authors’ or publishers’ pages, not to any book retail page. Continue reading “New Chinese Science Fiction”

An introduction to logistic regression

Variables can be described as either quantitative or qualitative.
Quantitative variables have a numerical value, e.g. a person’s income, or the price of a house.
Qualitative variables take values from one of several classes or categories, e.g. a person’s gender (male or female), the type of house purchased (villa, flat, penthouse, …), the colour of the eyes (brown, blue, green) or a cancer diagnosis.

Linear regression predicts a continuous variable, but sometimes we want to predict a categorical variable, i.e. a variable with a small number of possible discrete outcomes, usually unordered (there is no order among the outcomes).

Problems of this kind are called classification problems.

Classification

Given a feature vector X and a qualitative response y taking values in a fixed set C of categories, the classification task is to build a function f(X) that takes the feature vector X as input and predicts the value of y.
Often we are also (or even more) interested in estimating the probability that X belongs to each category in C.
For example, it is more valuable to know the probability that an insurance claim is fraudulent than to know only whether the claim is classified as fraudulent or not.
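The difference between a predicted label and a predicted probability can be sketched in a few lines of Python. This is only an illustration, not a fitted model: the two features (claim amount and number of previous claims), the intercept and the weights are all hypothetical values picked by hand. The logistic function is used here simply as one common way to squash a score into a number between 0 and 1.

```python
import math

# Hypothetical, hand-picked coefficients (NOT fitted to any real data).
INTERCEPT = -4.0
WEIGHTS = [0.002, 1.5]  # assumed features: claim amount, previous claims


def predict_proba(x):
    """Estimated probability that the claim described by x is fraudulent."""
    # Linear score: intercept plus the weighted sum of the features.
    score = INTERCEPT + sum(w * xi for w, xi in zip(WEIGHTS, x))
    # The logistic function maps any real score into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-score))


def predict(x, threshold=0.5):
    """Hard classification: returns only the label, discarding the probability."""
    return "fraudulent" if predict_proba(x) >= threshold else "legitimate"


# Three made-up claims: [claim amount, number of previous claims].
claims = [[500.0, 0], [1500.0, 1], [3000.0, 2]]
for claim in claims:
    print(claim, round(predict_proba(claim), 3), predict(claim))
```

Two claims can receive the same label while having quite different probabilities, which is why the probability is often the more useful output: the threshold can then be chosen according to the cost of each kind of mistake.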

There are many possible classification techniques, or classifiers, available to predict a qualitative response.

We will now look at one of them, called logistic regression.

Note: this post is part of a series about Machine Learning with Python.
Continue reading “An introduction to logistic regression”

Google and Microsoft blend AI into core products

I recently watched parts of both the Google and Microsoft developer conferences (respectively I/O 2017 and Build 2017).
As expected, there was a big emphasis on Artificial Intelligence but, overall, I liked the Microsoft one more, while Google’s felt too heterogeneous and without real meat (the new capabilities of Google Lens have been available for years elsewhere, e.g. at Baidu).

A few things that attracted my curiosity:

Vision plus X is the killer app of AI

At Google I/O, Dr. Fei-Fei Li – the new Chief Scientist of AI/ML at Google Cloud – articulated the most convincing vision: Continue reading “Google and Microsoft blend AI into core products”

Machines “think” differently but it’s not a problem (maybe)

Yet another article about the interpretability problem of many AI algorithms, this time in the MIT Technology Review, May/June 2017 issue.

The issue is clear; many of the most successful recent AI technologies revolve around deep learning: complex artificial neural networks – with so many layers of so many neurons transforming so many variables – that behave like “black boxes” for us.
We can no longer comprehend the model; we don’t know how or why the outcome for a specific input is obtained.
Is it scary?

In the film Dekalog 1 by Krzysztof Kieślowski – the first of ten short films inspired by the Ten Commandments, the first one being “I am the Lord your God; you shall have no other gods before me” – Krzysztof lives alone with Paweł, his 12-year-old and highly intelligent son, and introduces him to the world of personal computers. Continue reading “Machines “think” differently but it’s not a problem (maybe)”

Agile for managing a research data team

 

An interesting read: Lessons learned managing a research data science team, by Kate Matsudaira, in ACM Queue magazine.

The author describes how she managed a data science team in her role as VP of Engineering at a data mining startup.

When you have a team of people working on hard data science problems, the things that work in traditional software don’t always apply. When you are doing research and experiments, the work can be ambiguous, unpredictable, and the results can be hard to measure.

These are the changes that the team implemented in the process: Continue reading “Agile for managing a research data team”

[Link] Algorithms literature

From the Social Media Collective, part of the Microsoft Research labs, an interesting and comprehensive list of studies about algorithms as a social concern.

Our interest in assembling this list was to catalog the emergence of “algorithms” as objects of interest for disciplines beyond mathematics, computer science, and software engineering.

They also try to categorise the studies and add an intriguing timeline visualisation (which shows how much interest algorithms are currently sparking):

timeline