Haskell Basic Syntax (Part 1): Lists and the Type System

Arithmetic and Logical Operations

Arithmetic:

Prelude> 2 + 15
17
Prelude> 5 / 2
2.5
Prelude> 50 * (100 - 4999)
-244950
Prelude> 5 * -3

<interactive>:4:1: error:
    Precedence parsing error
        cannot mix ‘*’ [infixl 7] and prefix `-' [infixl 6] in the same infix expression
Prelude> 5 * (-3)
-15

Logical operations:

Prelude> True && False
False
Prelude> False || True
True
Prelude> not (True && False)
True

Equality tests:

Prelude> 5 == 5
True
Prelude> 5 == 4
False
Prelude> 5 /= 4
True
Prelude> "hello" == "hello"
True

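Function Calls

In Haskell, operators such as +, -, * and / are in fact functions too. They are written between their two arguments and are called infix functions.
Most other functions are prefix functions, called by writing the function name, a space, and then the arguments (fun a b ...).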

Prelude> succ 8
9
Prelude> min 9 10
9
Prelude> succ 9 + max 5 4 + 1
16

A function that takes two arguments can also be called in infix form by wrapping its name in backticks:

Prelude> div 10 2
5
Prelude> 10  `div` 2
5

Arguments passed to a Haskell function do not need to be wrapped in () as they are in C, so bar (bar 3) is equivalent to bar(bar(3)) in C.
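
The calling conventions described here can be tried out in a small file. This is only a sketch: bar is a hypothetical function invented for the illustration, and the expected values in the comments assume the default numeric types.

```haskell
-- Hypothetical helper, defined only so that `bar (bar 3)` has something to call.
bar :: Int -> Int
bar x = x + 1

main :: IO ()
main = do
  print (bar (bar 3))    -- nested call, like bar(bar(3)) in C: 5
  print (10 `div` 2)     -- prefix function used in infix form: 5
  print ((+) 2 15)       -- infix operator used in prefix form: 17
  print (succ 9 * 10)    -- application binds tightest: (succ 9) * 10 = 100
  print (succ (9 * 10))  -- parentheses change the grouping: 91
```

Function application has higher precedence than any operator, which is why succ 9 * 10 and succ (9 * 10) give different results.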

Function Definitions

Prelude> doubleMe x = x + x
Prelude> doubleMe 9
18
Prelude> doubleMe 8.3
16.6

Prelude> doubleSmallNumber x = if x > 100 then x else x * 2
Prelude> doubleSmallNumber 123
123
Prelude> doubleSmallNumber 80
160

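In Haskell, if is an expression rather than a statement. An expression is a piece of code that evaluates to a value.
For example, 5 is an expression that evaluates to the number 5, and x + y is an expression that evaluates to the sum of x and y.
This is why the else branch of a Haskell if is mandatory: it guarantees that the expression always has a value.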
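
Because if is an expression, it can appear inside a larger expression. A minimal sketch, where doubleSmallNumber' is a hypothetical variant of the function above that adds 1 to whichever branch is chosen:

```haskell
-- Hypothetical variant of doubleSmallNumber: the whole if-expression
-- is a value, so we can add 1 to whatever it evaluates to.
doubleSmallNumber' :: Int -> Int
doubleSmallNumber' x = (if x > 100 then x else x * 2) + 1

main :: IO ()
main = do
  print (doubleSmallNumber' 123)  -- 123 + 1 = 124
  print (doubleSmallNumber' 80)   -- 80 * 2 + 1 = 161
```

Without the parentheses, the + 1 would belong to the else branch only.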

list

A list in Haskell can only hold items of the same type.

Prelude> let a = [1,2,3,4]
Prelude> a
[1,2,3,4]

A string in Haskell is really a list whose items are of type Char; "hello" is just syntactic sugar for ['h','e','l','l','o'].

Prelude> ['h','e','l','l','o']
"hello"
Prelude> ['h','e','l','l','o'] == "hello"
True
Prelude> :t "hello"
"hello" :: [Char]

Lists are concatenated with the ++ operator.

Prelude> [1,2,3,4] ++ [5]
[1,2,3,4,5]
Prelude> "hello" ++ " " ++ "world"
"hello world"

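Note: when joining two lists with ++, the right-hand operand must be a list, so even a single item has to be wrapped in [].
However many items the right-hand list has, ++ has to walk through the entire left-hand list, so appending to a long list is slow.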

The : operator (cons) adds a single item to the front of a list.

Prelude> 'A' : " Small Cat"
"A Small Cat"
Prelude> 5 : [1,2,3,4,5]
[5,1,2,3,4,5]

Note: [1,2,3] is really syntactic sugar for 1:2:3:[].

Prelude> 1:2:3:[]
[1,2,3]
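
The difference between ++ and : can be sketched with two small helpers. The names appendOne and prependOne are hypothetical and exist only for this illustration.

```haskell
-- Appending one item at the end: (++) has to walk the whole left-hand list.
appendOne :: [Int] -> Int -> [Int]
appendOne xs x = xs ++ [x]

-- Prepending with (:) does not traverse the existing list.
prependOne :: Int -> [Int] -> [Int]
prependOne x xs = x : xs

main :: IO ()
main = do
  print (appendOne [1,2,3,4] 5)   -- [1,2,3,4,5]
  print (prependOne 0 [1,2,3,4])  -- [0,1,2,3,4]
  print (1:2:3:[] == [1,2,3])     -- True: the literal is sugar for the cons chain
```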

The !! operator returns the item at a given index (counting from 0).

Prelude> [1,2,3,4] !! 0
1
Prelude> "hello" !! 1
'e'

elem checks whether an item is contained in a list.

Prelude> elem 4 [3,4,5,6]
True
Prelude> elem 100 [3,4,5,6]
False

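Lists are compared lexicographically: items are compared one by one starting from the left. If one list is a prefix of the other, the longer list is the greater one.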

Prelude> [3,2,1] > [3,1,0]
True
Prelude> [3,2,1] > [2,10,100]
True
Prelude> [3,4,2] > [3,4]
True

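Common functions that operate on lists:

head: returns the first element of a list
tail: returns everything except the first element
last: returns the last element of a list
init: returns everything except the last element
length: returns the length of a list
null: checks whether a list is empty
reverse: returns the list in reverse order
minimum: returns the smallest element of a list
maximum: returns the largest element of a list
sum: returns the sum of all elements of a list
product: returns the product of all elements of a list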

Prelude> head [5,4,3,2,1]
5
Prelude> tail [5,4,3,2,1]
[4,3,2,1]
Prelude> last [5,4,3,2,1]
1
Prelude> init [5,4,3,2,1]
[5,4,3,2]
Prelude> length [5,4,3,2,1]
5
Prelude> null [5,4,3,2,1]
False
Prelude> reverse [5,4,3,2,1]
[1,2,3,4,5]

For taking sublists, take returns the first n items of a list, and drop returns everything except the first n items.

Prelude> take 3 [5,4,3,2,1]
[5,4,3]
Prelude> drop 3 [5,4,3,2,1]
[2,1]
Prelude> take 0 [5,4,3,2,1]
[]
Prelude> take 10 [5,4,3,2,1]
[5,4,3,2,1]
Prelude> drop 100 [5,4,3,2,1]
[]
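
take and drop combine naturally. As a sketch, a hypothetical slice helper (not a Prelude function) that extracts a run of items from the middle of a list:

```haskell
-- Hypothetical helper: skip `from` items, then keep at most `count` items.
slice :: Int -> Int -> [a] -> [a]
slice from count xs = take count (drop from xs)

main :: IO ()
main = do
  print (slice 1 3 [5,4,3,2,1])                     -- [4,3,2]
  print (slice 3 10 "hello")                        -- "lo": asking for too many is fine
  print (take 2 [5,4,3,2,1] ++ drop 2 [5,4,3,2,1])  -- [5,4,3,2,1]: the two halves rejoin
```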

range

Prelude> [1..10]
[1,2,3,4,5,6,7,8,9,10]
Prelude> ['a' .. 'z']
"abcdefghijklmnopqrstuvwxyz"

A range can include a step. For example, all even numbers from 2 to 20, and all multiples of 3 from 3 to 20:

Prelude> [2,4..20]
[2,4,6,8,10,12,14,16,18,20]
Prelude> [3,6..20]
[3,6,9,12,15,18]

Syntactically, the [] contains the first two items and the upper limit of the range.
So the list of numbers from 20 down to 1 can be written as [20,19..1].

Prelude> [20,19..1]
[20,19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1]

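To get the first 24 multiples of 13, you can write [13,26..24*13] or take 24 [13,26..].
When no upper limit is given (as in [13,26..]), the range produces an infinite list. Haskell evaluates lazily, so only the items that are actually demanded get computed, and an infinite list will not eat up all the memory as long as you only consume a finite part of it.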
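
A small sketch of working with an infinite range. firstMultiples is a hypothetical helper written for this illustration; takeWhile is a standard Prelude function that stops at the first item failing the test.

```haskell
-- Hypothetical helper: the first n multiples of k, cut from an infinite range.
firstMultiples :: Int -> Int -> [Int]
firstMultiples n k = take n [k, 2 * k ..]

main :: IO ()
main = do
  print (firstMultiples 5 13)                          -- [13,26,39,52,65]
  print (firstMultiples 24 13 == [13, 26 .. 24 * 13])  -- True: both spellings agree
  print (takeWhile (< 60) [13, 26 ..])                 -- [13,26,39,52]
```

Only the demanded prefix of the infinite list is ever built, so each line finishes immediately.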

Infinite lists can also be produced with cycle or repeat:

Prelude> take 10 (cycle [1,2,3])
[1,2,3,1,2,3,1,2,3,1]
Prelude> take 10 (repeat 5)
[5,5,5,5,5,5,5,5,5,5]

List Comprehensions

List comprehensions in Haskell look very much like set-builder notation in mathematics, e.g. $S = \{2 \cdot x \mid x \in \mathbb{N},\ x \le 10\}$:

Prelude> [x*2 | x <- [1..10]]
[2,4,6,8,10,12,14,16,18,20]

This is equivalent to the following Python expression:

>>> [x * 2 for x in range(1, 11)]
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

A more complex case, with a predicate that filters the results:

Prelude> [x*2 | x <- [1..10], x*2 >= 12]
[12,14,16,18,20]

Comprehensions can even draw from several lists:

Prelude> [x*y | x <- [2,5,10], y <- [8,10,11]]
[16,20,22,40,50,55,80,100,110]
Prelude> [x*y | x <- [2,5,10], y <- [8,10,11], x*y > 50]
[55,80,100,110]
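
The output side of a comprehension can be any expression, including an if. A short sketch along the lines of the well-known boomBangs example from Learn You a Haskell, which keeps only the odd numbers and replaces each with a string:

```haskell
-- Keep the odd numbers; small ones become "BOOM!", the rest "BANG!".
boomBangs :: [Int] -> [String]
boomBangs xs = [if x < 10 then "BOOM!" else "BANG!" | x <- xs, odd x]

main :: IO ()
main = print (boomBangs [7 .. 13])
-- 7 and 9 are odd and below 10; 11 and 13 are odd and not below 10:
-- ["BOOM!","BOOM!","BANG!","BANG!"]
```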

A list comprehension can be used to define our own length function:

Prelude> length xs = sum [1 | _ <- xs]
Prelude> length [1,2,3,4]
4

Define a function that removes every character that is not an uppercase letter from a string:

Prelude> removeNonUppercase st = [c | c <- st, elem c ['A' .. 'Z']]
Prelude> removeNonUppercase "Hahaha! Ahahaha!"
"HA"

Tuple

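Compared with lists, tuples in Haskell have the following characteristics:

The type of a tuple is determined by the number of elements it contains and the type of each element
A tuple can contain elements of different types

A tuple such as ("Christopher", "Walken", 55) is valid: a single tuple can mix strings (lists), numbers, and other types.
Lists such as [(1,2),(8,11,5),(4,5)] and [(1,2),("One",2)] are not valid, because tuples of different lengths or with different element types have different types and so cannot be elements of the same list.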

Prelude> :t (1,2)
(1,2) :: (Num t, Num t1) => (t1, t)
Prelude> :t (8,11,5)
(8,11,5) :: (Num t, Num t1, Num t2) => (t2, t1, t)
Prelude> :t ("one",2)
("one",2) :: Num t => ([Char], t)

fst returns the first element of a tuple and snd returns the second. Both functions only work on pairs (tuples of length 2).

Prelude> fst (8,11)
8
Prelude> snd ("Wow", False)
False

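zip pairs up the elements of two lists one by one into 2-tuples, producing a new list of tuples. If the lists differ in length, the result is as long as the shorter one, which is why zipping with an infinite list works.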

Prelude> zip [1,2,3,4,5] [5,5,5,5,5]
[(1,5),(2,5),(3,5),(4,5),(5,5)]
Prelude> zip [1..5] ["one", "two", "three", "four", "five"]
[(1,"one"),(2,"two"),(3,"three"),(4,"four"),(5,"five")]
Prelude> zip [1..] ["apple", "orange", "cherry", "mango"]
[(1,"apple"),(2,"orange"),(3,"cherry"),(4,"mango")]
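
zip, fst, snd, and list comprehensions fit together well. A minimal sketch, where numbered is a hypothetical helper that attaches 1-based positions to a list of strings:

```haskell
-- Hypothetical helper: pair every string with its 1-based position.
numbered :: [String] -> [(Int, String)]
numbered xs = zip [1 ..] xs

main :: IO ()
main = do
  let fruits = numbered ["apple", "orange", "cherry", "mango"]
  print fruits                   -- [(1,"apple"),(2,"orange"),(3,"cherry"),(4,"mango")]
  print [fst p | p <- fruits]    -- [1,2,3,4]
  print [snd p | p <- fruits]    -- ["apple","orange","cherry","mango"]
  print (zip [1, 2, 3] "ab")     -- [(1,'a'),(2,'b')]: stops at the shorter list
```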

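The Type System

Haskell is statically typed: the type of every expression is known at compile time.
Unlike languages in which every type has to be written out, Haskell has type inference: the compiler can work out on its own that, say, 'a' is a Char, so explicit annotations are rarely required.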

Prelude> :t 'a'
'a' :: Char
Prelude> :t True
True :: Bool
Prelude> :t "HELLO!"
"HELLO!" :: [Char]
Prelude> :t (True, 'a')
(True, 'a') :: (Bool, Char)
Prelude> :t ('a','b','c')
('a','b','c') :: (Char, Char, Char)
Prelude> :t 4 == 5
4 == 5 :: Bool

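:: is read as "has type".
The type of a tuple depends on the type of each element and on the tuple's length, so the expression ('a','b','c') has type (Char, Char, Char).
The expression 4 == 5 always evaluates to False, so its type is Bool.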

Functions in Haskell have types too.

Prelude> removeNonUppercase st = [ c | c <- st, c `elem` ['A'..'Z']]
Prelude> :t removeNonUppercase
removeNonUppercase :: [Char] -> [Char]

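removeNonUppercase has type [Char] -> [Char], meaning it takes a string as its argument and returns a string. In general, a function's type is written as the types of its parameters and its return value separated by ->.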

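Type Variables

A function's type is given by its parameters and return value, but for some functions these are not fixed to one particular type. head, for instance, returns the first element of a list, and the elements of a list can be of many different types.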

Prelude> head [1,2,3,4]
1
Prelude> head "hello"
'h'
Prelude> :t head
head :: [a] -> a

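The a in head :: [a] -> a is a type variable: it means that the parameter or return value can be of any type. Functions whose types contain type variables are called polymorphic functions.
Besides a, other names such as b, c, and d can serve as type variables. In [a] -> a, a can stand for any type, but both occurrences of a must be the same type.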

Prelude> fst ("hello", True)
"hello"
Prelude> :t fst
fst :: (a, b) -> a
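
Type variables can also be used in our own signatures. A minimal sketch with two hypothetical helpers, swapPair and firstOr, built only from functions introduced earlier:

```haskell
-- Hypothetical helper: a and b may be different types, so the pair can be mixed.
swapPair :: (a, b) -> (b, a)
swapPair p = (snd p, fst p)

-- Hypothetical helper: the default, the list items, and the result all share type a.
firstOr :: a -> [a] -> a
firstOr def xs = if null xs then def else head xs

main :: IO ()
main = do
  print (swapPair ("hello", True))  -- (True,"hello")
  print (firstOr 'x' "hello")       -- 'h'
  print (firstOr 'x' "")            -- 'x': the list is empty, so the default is used
```

firstOr 'x' [1,2,3] would be rejected by the type checker, because the same a cannot be both Char and a number.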

Typeclass

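A typeclass is a kind of interface that defines some behavior. If a type belongs to a particular typeclass, it implements the behavior that the typeclass describes, much like an interface in Java.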

Prelude> :t (==)
(==) :: Eq a => a -> a -> Bool

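The part before the => symbol is called a class constraint. Read it like this: == takes two values of the same type (a) and returns a Bool saying whether they are equal. The type of the two arguments must be a member of the Eq class (hence the name class constraint).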

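The Eq typeclass provides an interface for testing equality; any type whose values can meaningfully be compared for equality should be a member of Eq. All standard Haskell types except IO and functions are members of Eq.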

Some basic typeclasses:
Eq is for types that support equality testing. Its functions are == and /=.

Prelude> 5 == 5
True
Prelude> 5 /= 5
False
Prelude> 'a' == 'a'
True
Prelude> "Ho Ho" == "Ho Ho"
True
Prelude> 3.432 == 3.432
True

Ord is for types that have an ordering. It covers all the basic comparison functions such as >, <, and >=.

Prelude> :t (>)
(>) :: Ord a => a -> a -> Bool
Prelude> "Abrakadabra" < "Zebra"
True
Prelude> 5 >= 2
True
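
A class constraint can appear in our own signatures as well. As a sketch, a hypothetical isBetween function that works for any type in Ord:

```haskell
-- Hypothetical helper: works for any ordered type, thanks to the Ord constraint.
isBetween :: Ord a => a -> a -> a -> Bool
isBetween lo hi x = lo <= x && x <= hi

main :: IO ()
main = do
  print (isBetween 1 10 5)                           -- True
  print (isBetween 'a' 'f' 'z')                      -- False
  print (isBetween "Abrakadabra" "Zebra" "Haskell")  -- True: strings compare left to right
```

Dropping the Ord a => part would make the definition fail to compile, since <= is only available for members of Ord.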

Members of Show can be represented as strings. The most commonly used function for Show members is show, which converts a value into its string representation:

Prelude> show 3
"3"
Prelude> show 5.334
"5.334"
Prelude> show True
"True"

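Read is the opposite of Show. The read function takes a string and returns a value of some type that is a member of Read; the target type is either inferred from context or specified explicitly with a type annotation: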

Prelude> read "True" || False
True
Prelude> read "8.2" + 3.8
12.0
Prelude> read "[1,2,3,4]" ++ [3]
[1,2,3,4,3]
Prelude> read "[1,2,3,4]"
*** Exception: Prelude.read: no parse
Prelude> read "[1,2,3,4]" :: [Int]
[1,2,3,4]
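
Inside a program, the context that read needs often comes from a type signature instead of an inline annotation. A minimal sketch, where parseInts is a hypothetical helper:

```haskell
-- Hypothetical helper: the signature tells read which type to produce.
parseInts :: String -> [Int]
parseInts s = read s

main :: IO ()
main = do
  print (parseInts "[1,2,3,4]")                -- [1,2,3,4]
  print (read "5" + (1 :: Int))                -- 6: the annotation on 1 fixes the type
  print (show (parseInts "[1,2]") == "[1,2]")  -- True: show undoes read here
```

As in the GHCi session above, read raises an exception at run time when the string does not parse as the requested type.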

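Members of Enum are sequentially ordered types whose values can be enumerated. Every Enum member can be used in a range to build a list, and has a successor (succ) and a predecessor (pred).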

Prelude> ['a'..'e']
"abcde"
Prelude> [3..5]
[3,4,5]
Prelude> succ 'B'
'C'
Prelude> pred 'C'
'B'

Num is the numeric typeclass. Its members can act like numbers.

Prelude> :t 20
20 :: Num t => t
Prelude> 20 :: Float
20.0
Prelude> 20 :: Double
20.0
Prelude> :t (*)
(*) :: Num a => a -> a -> a

The function * has type Num a => a -> a -> a, so its arguments must be members of the Num typeclass, and both must be of the same type.

Prelude> 5 * (6 :: Float)
30.0
Prelude> (5 :: Int) * (6 :: Float)

<interactive>:56:15: error:
    • Couldn't match expected type ‘Int’ with actual type ‘Float’
    • In the second argument of ‘(*)’, namely ‘(6 :: Float)’
      In the expression: (5 :: Int) * (6 :: Float)
      In an equation for ‘it’: it = (5 :: Int) * (6 :: Float)
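
One common way around this kind of mismatch is the Prelude function fromIntegral, which converts an integral value into whatever numeric type the context requires. A minimal sketch, where scale is a hypothetical helper:

```haskell
-- Hypothetical helper: convert the Int to Float first so both operands of (*) agree.
scale :: Int -> Float -> Float
scale n factor = fromIntegral n * factor

main :: IO ()
main = do
  print (scale 5 6)                         -- 30.0
  print (fromIntegral (length "abcd") * 2.5)  -- 10.0: length returns an Int
```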

References

Learn You a Haskell for Great Good!