用20行Python构建Markov Chain语句生成器

A bot who can write a long letter with ease, cannot write ill.
- Jane Austen, Pride and Prejudice

This post walks you step by step through writing a Markov chain from scratch in Python, in order to generate brand-new English sentences that read as if a real person wrote them. The text we use to build the Markov chain is Pride and Prejudice by Jane Austen. A runnable notebook version is available on Colab. An English version of this post is also available.

Setup

First, download the full text of Pride and Prejudice.

    # Download Pride and Prejudice and cut off the header.
    !curl https://www.gutenberg.org/files/1342/1342-0.txt | tail -n+32 > /content/pride-and-prejudice.txt
    # Preview the file.
    !head -n 10 /content/pride-and-prejudice.txt

      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100  707k  100  707k    0     0  1132k      0 --:--:-- --:--:-- --:--:-- 1130k
    PRIDE AND PREJUDICE
    By Jane Austen
    Chapter 1
    It is a truth universally acknowledged, that a single man in possession

Add some necessary imports.
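The excerpt ends before the generator itself, so here is a rough sketch of the general idea rather than the post's actual code. It assumes a word-level chain in which each state is the last `order` words; the names `build_chain` and `generate` and the `order`, `length`, and `seed` parameters are hypothetical, chosen only for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=15, seed=None):
    """Random-walk the chain, starting from a randomly chosen state."""
    rng = random.Random(seed)
    state = rng.choice(list(chain.keys()))
    out = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:  # dead end: this state was never followed by a word
            break
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)


# Toy input: the opening sentence of the novel. A real run would read the
# downloaded file instead (assumption: the whole book is used as the corpus).
sample = (
    "It is a truth universally acknowledged, that a single man in possession "
    "of a good fortune, must be in want of a wife."
)
chain = build_chain(sample, order=1)
sentence = generate(chain, length=10, seed=0)
print(sentence)
```

Repeated words are kept in each follower list on purpose, so a word that follows a state more often is proportionally more likely to be picked. A larger `order` tends to give more coherent output but copies the source text more closely.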

Machine Learning Reference

I often need to look up random bits of ML-related information. This post is a work-in-progress attempt to collect common machine learning terms and formulas in one central place. I plan to update it as I come across further useful pieces of information. This reference is not intended to be exhaustive. In fact, the opposite: it is intended only as a concise, opinionated collection of the most relevant bits of ML knowledge for quick lookup.

Vanguard Funds vs ETFs

After researching Vanguard funds vs. ETFs, I still haven’t found a good resource that lists in detail the benefits and downsides of each. A Vanguard mutual fund is provided and managed by Vanguard. You can only buy Vanguard funds on vanguard.com or over the phone. A Vanguard exchange-traded fund (ETF) is packaged up like a stock, and its shares can be traded on any market with any brokerage account. This is my attempt to compile a comprehensive list of tradeoffs.

The Most Commonly Used Chinese Words

I think the most effective way to learn a language is to prioritize the words used most frequently in day-to-day life. Studying words in order of frequency maximizes the marginal gain in understanding from each successive word you learn. To that end, I took a dataset of Weibo (China’s Twitter) posts and ranked all words that appeared in the dataset in order of frequency.
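A frequency ranking of this kind can be sketched with Python's `collections.Counter`. This is only an illustration under stated assumptions: the excerpt does not say how the Weibo posts were segmented into words, so the sketch takes posts that are already split into word lists, and the three sample posts and the `rank_by_frequency` name are made up for the example.

```python
from collections import Counter


def rank_by_frequency(segmented_posts):
    """Count every word across all posts and return (word, count) pairs,
    most frequent first."""
    counts = Counter()
    for words in segmented_posts:
        counts.update(words)
    return counts.most_common()


# Hypothetical, pre-segmented sample posts (not taken from the real dataset).
posts = [
    ["我", "喜欢", "跑步"],
    ["我", "喜欢", "学习", "中文"],
    ["我", "今天", "跑步"],
]

ranked = rank_by_frequency(posts)
for word, count in ranked:
    print(word, count)
```

Chinese text has no spaces between words, so a real pipeline would need a word segmentation step before counting.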

Napa Half Race Report

The end of the pain cave. Note: the clock time in the picture is for a different race. 😆

I PR’d my half marathon at the Napa Half!! Yeahh!!!!

Old PR: 1:31:37 at the 2019 Kaiser SF Half.
New PR: 1:28:59.
Improvement: 2.9% (-2:38)

In this post I want to look more closely at the race and analyze what went well, what could have gone better, and what was interesting about it.
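The improvement figures follow directly from the two finish times. A quick check, with `to_seconds` as a hypothetical helper written for this example:

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' finish time to total seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


old_pr = to_seconds("1:31:37")  # 2019 Kaiser SF Half
new_pr = to_seconds("1:28:59")  # Napa Half

saved = old_pr - new_pr
improvement_pct = 100 * saved / old_pr

print(f"{saved // 60}:{saved % 60:02d} faster, {improvement_pct:.1f}% improvement")
# prints "2:38 faster, 2.9% improvement"
```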

More Recent Posts

  1. Stop Using Bookmarks  Fri, Mar 6, 2020
  2. 2019 Year in Review  Tue, Dec 31, 2019
  3. The Best Books I Read in 2019  Sat, Dec 28, 2019
  4. CIM Race Report  Thu, Dec 12, 2019
  5. Marathon Training Update  Thu, Nov 21, 2019
  6. I'm Joining Waymo  Sun, Nov 10, 2019
  7. Things I Learned as a First-Time Intern Host  Sat, Aug 31, 2019
  8. How are Words Represented in Machine Learning?  Sat, Jul 13, 2019
  9. 3 Tips for New Technical Interviewers  Sat, Jul 6, 2019
  10. On Being Injured (Again)  Fri, Jun 14, 2019

Most Popular Posts

  1. Build a Markov Chain Sentence Generator in 20 lines of Python  Wed, Jan 16, 2019
  2. How to Solve Every Software Engineering Interview Question  Tue, Nov 20, 2018
  3. How are Words Represented in Machine Learning?  Sat, Jul 13, 2019
  4. How to Export Evaluation Results in Tensorflow  Fri, Jan 5, 2018

All Posts by Category

Software Engineering

用20行Python构建Markov Chain语句生成器  Sat, Jul 4, 2020
Machine Learning Reference  Sun, May 17, 2020
I'm Joining Waymo  Sun, Nov 10, 2019
How I Host Static Sites With Automatic Deploy on Green  Sun, Feb 3, 2019
Build a Markov Chain Sentence Generator in 20 lines of Python  Wed, Jan 16, 2019
It's OK to Make Mistakes in Coding Interviews  Tue, Sep 25, 2018
Doing Cryptography in TensorFlow  Sat, Jun 23, 2018
Understanding the Security of Cryptographic Hash Functions  Mon, Apr 16, 2018
Quantum-resistant crypto, Elliptic Curves, and other learnings  Sat, Mar 31, 2018
How to Export Evaluation Results in Tensorflow  Fri, Jan 5, 2018
Example: Save and Load a TensorFlow Model  Sun, Nov 19, 2017
Chrome Security Architecture  Sun, Oct 22, 2017
Where Web Payments are Going  Sun, Jul 31, 2016
You are an engineering manager whether you realize it or not  Mon, Jul 25, 2016
Reading Papers: Bufferbloat, SSL Warnings, Orleans, and more  Tue, Mar 24, 2015
Visualizing JavaScript Project Structure  Mon, Jan 12, 2015
How System Calls Work  Thu, Sep 18, 2014
Unforseen Perks of Pair Programming  Sun, Jun 15, 2014
Interviewing 2 Years in: What Worked  Fri, Jun 6, 2014
Intro to Angular.js Talk  Mon, Apr 28, 2014
Things learned while preparing for Angular Live Code  Tue, Apr 22, 2014
Two interesting IE JavaScript quirks  Fri, Mar 14, 2014
JavaScript's Mutative vs. Non-Mutative Array Methods  Thu, Feb 20, 2014
Japanese Programming  Mon, Feb 25, 2013

Running

Napa Half Race Report  Wed, Mar 11, 2020
CIM Race Report  Thu, Dec 12, 2019
Marathon Training Update  Thu, Nov 21, 2019
On Being Injured (Again)  Fri, Jun 14, 2019
Kaiser SF Half Race Report  Wed, Feb 6, 2019
Building a Running Pace Calculator With AMP  Sun, Feb 3, 2019
2018 Year in Review  Mon, Dec 24, 2018
Reasons to Try Trail Running  Thu, Sep 27, 2018
China Camp Trail Race Report: Things I Wish I Had Known  Sun, Jun 3, 2018
2016 Year in Review  Sat, Dec 31, 2016
Jeff's 2015 in 5 Themes  Mon, Jan 11, 2016
Breaking the Cycle  Sun, Nov 29, 2015
Wrist Upgrade: Suunto Ambit3  Sun, Oct 25, 2015

Travel

Sydney Photos  Thu, Jul 26, 2018
Mt. Rainier Backpacking Trip  Mon, Feb 27, 2017
Ueno Zoo 上野動物園  Tue, Feb 7, 2012
Tsukiji Fish Market 築地市場  Thu, Feb 2, 2012
January in Japan  Wed, Feb 1, 2012
Tokyo Tower 東京タワー  Fri, Jan 27, 2012
Nara 奈良  Sat, Jan 7, 2012
Fushimi Inari-taisha 伏見稲荷大社  Thu, Jan 5, 2012
Kiyomizu Temple 清水寺  Tue, Jan 3, 2012
Osaka Aquarium 海遊館  Tue, Jan 3, 2012
Study Abroad in Japan  Mon, Jan 3, 2011

Japan

私の日本語勉強経験 - My Experience Learning Japanese  Sun, Oct 14, 2018
Ueno Zoo 上野動物園  Tue, Feb 7, 2012
Tsukiji Fish Market 築地市場  Thu, Feb 2, 2012
January in Japan  Wed, Feb 1, 2012
Tokyo Tower 東京タワー  Fri, Jan 27, 2012
Nara 奈良  Sat, Jan 7, 2012
Fushimi Inari-taisha 伏見稲荷大社  Thu, Jan 5, 2012
Kiyomizu Temple 清水寺  Tue, Jan 3, 2012
Osaka Aquarium 海遊館  Tue, Jan 3, 2012
Study Abroad in Japan  Mon, Jan 3, 2011

Photos

Sydney Photos  Thu, Jul 26, 2018
Ueno Zoo 上野動物園  Tue, Feb 7, 2012
Tsukiji Fish Market 築地市場  Thu, Feb 2, 2012
January in Japan  Wed, Feb 1, 2012
Tokyo Tower 東京タワー  Fri, Jan 27, 2012
Nara 奈良  Sat, Jan 7, 2012
Fushimi Inari-taisha 伏見稲荷大社  Thu, Jan 5, 2012
Kiyomizu Temple 清水寺  Tue, Jan 3, 2012
Osaka Aquarium 海遊館  Tue, Jan 3, 2012

Machine Learning

用20行Python构建Markov Chain语句生成器  Sat, Jul 4, 2020
Machine Learning Reference  Sun, May 17, 2020
I'm Joining Waymo  Sun, Nov 10, 2019
How are Words Represented in Machine Learning?  Sat, Jul 13, 2019
Build a Markov Chain Sentence Generator in 20 lines of Python  Wed, Jan 16, 2019
Doing Cryptography in TensorFlow  Sat, Jun 23, 2018
How to Export Evaluation Results in Tensorflow  Fri, Jan 5, 2018
Example: Save and Load a TensorFlow Model  Sun, Nov 19, 2017

Front End

Building a Running Pace Calculator With AMP  Sun, Feb 3, 2019
How I Host Static Sites With Automatic Deploy on Green  Sun, Feb 3, 2019
Where Web Payments are Going  Sun, Jul 31, 2016
Visualizing JavaScript Project Structure  Mon, Jan 12, 2015
Intro to Angular.js Talk  Mon, Apr 28, 2014
Things learned while preparing for Angular Live Code  Tue, Apr 22, 2014
Two interesting IE JavaScript quirks  Fri, Mar 14, 2014
JavaScript's Mutative vs. Non-Mutative Array Methods  Thu, Feb 20, 2014

Life

2019 Year in Review  Tue, Dec 31, 2019
I'm Joining Waymo  Sun, Nov 10, 2019
2018 Year in Review  Mon, Dec 24, 2018
Goodbye 2017, Hello 2018  Sun, Dec 31, 2017
2016 Year in Review  Sat, Dec 31, 2016
Jeff's 2015 in 5 Themes  Mon, Jan 11, 2016
Interviewing 2 Years in: What Worked  Fri, Jun 6, 2014

Video

Mt. Rainier Backpacking Trip  Mon, Feb 27, 2017
Intro to Angular.js Talk  Mon, Apr 28, 2014
Pechakucha Waterville Talk  Tue, May 1, 2012
Study Abroad in Japan  Mon, Jan 3, 2011
Huge Snow  Wed, Apr 21, 2010

Interviewing

I'm Joining Waymo  Sun, Nov 10, 2019
3 Tips for New Technical Interviewers  Sat, Jul 6, 2019
How to Solve Every Software Engineering Interview Question  Tue, Nov 20, 2018
It's OK to Make Mistakes in Coding Interviews  Tue, Sep 25, 2018
Interviewing 2 Years in: What Worked  Fri, Jun 6, 2014

Yearly Review

2019 Year in Review  Tue, Dec 31, 2019
2018 Year in Review  Mon, Dec 24, 2018
Goodbye 2017, Hello 2018  Sun, Dec 31, 2017
2016 Year in Review  Sat, Dec 31, 2016
Jeff's 2015 in 5 Themes  Mon, Jan 11, 2016

JavaScript

Visualizing JavaScript Project Structure  Mon, Jan 12, 2015
Things learned while preparing for Angular Live Code  Tue, Apr 22, 2014
Two interesting IE JavaScript quirks  Fri, Mar 14, 2014
JavaScript's Mutative vs. Non-Mutative Array Methods  Thu, Feb 20, 2014

Leadership

Things I Learned as a First-Time Intern Host  Sat, Aug 31, 2019
Reading Notes: The Manager's Path  Wed, Nov 14, 2018
You are an engineering manager whether you realize it or not  Mon, Jul 25, 2016
Unforseen Perks of Pair Programming  Sun, Jun 15, 2014

Books

Stop Using Bookmarks  Fri, Mar 6, 2020
The Best Books I Read in 2019  Sat, Dec 28, 2019
Reading Notes: Mindset: The New Psychology of Success  Mon, Dec 10, 2018
Reading Notes: The Manager's Path  Wed, Nov 14, 2018

Chinese

The Most Commonly Used Chinese Words  Sun, Mar 15, 2020
Measuring My Chinese Progress  Sat, Apr 27, 2019
More Similar Mandarin Words  Sat, Oct 20, 2018
Some Similar Mandarin Words  Sat, Jun 23, 2018

Talks

Chrome Security Architecture  Sun, Oct 22, 2017
Where Web Payments are Going  Sun, Jul 31, 2016
Intro to Angular.js Talk  Mon, Apr 28, 2014
Pechakucha Waterville Talk  Tue, May 1, 2012

Cryptography

Doing Cryptography in TensorFlow  Sat, Jun 23, 2018
Understanding the Security of Cryptographic Hash Functions  Mon, Apr 16, 2018
Quantum-resistant crypto, Elliptic Curves, and other learnings  Sat, Mar 31, 2018

Metaengineering

Setting Up a Recruiter Auto-reply Bot  Wed, Jun 5, 2019
A Year of the Pomodoro Technique  Tue, Jan 29, 2019
Unforseen Perks of Pair Programming  Sun, Jun 15, 2014

NLP

用20行Python构建Markov Chain语句生成器  Sat, Jul 4, 2020
How are Words Represented in Machine Learning?  Sat, Jul 13, 2019
Build a Markov Chain Sentence Generator in 20 lines of Python  Wed, Jan 16, 2019

Meta

New Blog, Who Dis?  Fri, Mar 30, 2018
Hello World, Again  Sat, Feb 8, 2014

Career

I'm Joining Waymo  Sun, Nov 10, 2019
Interviewing 2 Years in: What Worked  Fri, Jun 6, 2014

College

Pechakucha Waterville Talk  Tue, May 1, 2012
Huge Snow  Wed, Apr 21, 2010

Japanese

私の日本語勉強経験 - My Experience Learning Japanese  Sun, Oct 14, 2018
Japanese Programming  Mon, Feb 25, 2013

Security

Chrome Security Architecture  Sun, Oct 22, 2017

Quantified Self

Pechakucha Waterville Talk  Tue, May 1, 2012

Payments

Where Web Payments are Going  Sun, Jul 31, 2016

Filmmaking

Pechakucha Waterville Talk  Tue, May 1, 2012

Finance

Vanguard Funds vs ETFs  Thu, Apr 2, 2020

Devops

How I Host Static Sites With Automatic Deploy on Green  Sun, Feb 3, 2019