//- 💫 LANDING PAGE
include _includes/_mixins
+landing-header
h1.c-landing__title.u-heading-0
| Industrial-Strength#[br]
| Natural Language#[br]
| Processing
h2.c-landing__title.o-block.u-heading-1
| in Python
+landing-badge(gh("spaCy") + "/releases/tag/v2.0.0-alpha", "v2alpha", "Try spaCy v2.0.0 alpha!")
+grid.o-content
+grid-col("third").o-card
+h(2) Fastest in the world
p
| spaCy excels at large-scale information extraction tasks.
| It's written from the ground up in carefully memory-managed
| Cython. Independent research has confirmed that spaCy is
| the fastest syntactic parser in the world. If your application needs to
| process entire web dumps, spaCy is the library you want to
| be using.
+button("/docs/api", true, "primary")
| Facts & figures
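//- The card above mentions processing entire web dumps; the sketch below
//- shows one common way to do that with nlp.pipe, which batches documents
//- and releases the GIL so several threads can work in parallel. The
//- corpus file name and batch settings are illustrative, not part of the page.
+terminal("process_dump.py").
    # Stream a large corpus through the pipeline in batches.
    import spacy
    nlp = spacy.load('en')
    # Hypothetical corpus: one document per line
    texts = (line for line in open('web_dump.txt'))
    for doc in nlp.pipe(texts, batch_size=1000, n_threads=4):
        for ent in doc.ents:
            print(ent.text, ent.label_)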
+grid-col("third").o-card
+h(2) Get things done
p
| spaCy is designed to help you do real work — to build real
| products, or gather real insights. The library respects
| your time, and tries to avoid wasting it. It's easy to
| install, and its API is simple and productive. I like to
| think of spaCy as the Ruby on Rails of Natural Language
| Processing.
+button("/docs/usage", true, "primary")
| Get started
+grid-col("third").o-card
+h(2) Deep learning
p
| spaCy is the best way to prepare text for deep learning.
| It interoperates seamlessly with
| #[+a("https://www.tensorflow.org") TensorFlow],
| #[+a("https://keras.io") Keras],
| #[+a("http://scikit-learn.org") Scikit-Learn],
| #[+a("https://radimrehurek.com/gensim") Gensim] and the
| rest of Python's awesome AI ecosystem. spaCy helps you
| connect the statistical models trained by these libraries
| to the rest of your application.
+button("/docs/usage/deep-learning", true, "primary")
| Read more
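//- The card above says spaCy helps connect statistical models to the rest
//- of your application; this is a minimal sketch of one common pattern,
//- feeding spaCy's document vectors into a scikit-learn classifier. The
//- toy texts and labels are invented for illustration.
+terminal("vectors_to_sklearn.py").
    # Use spaCy's averaged word vectors as features for scikit-learn.
    import spacy
    from sklearn.linear_model import LogisticRegression
    nlp = spacy.load('en')
    train_texts = [u'the fries were gross', u'best fries ever']
    train_labels = [0, 1]
    X = [nlp(text).vector for text in train_texts]
    clf = LogisticRegression().fit(X, train_labels)
    print(clf.predict([nlp(u'worst fries ever').vector]))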
.o-inline-list.o-block.u-border-bottom.u-text-small.u-text-center.u-padding-small
+a(gh("spaCy") + "/releases")
strong.u-text-label.u-color-subtle #[+icon("code", 18)] Latest release:
| v#{SPACY_VERSION}
if LATEST_NEWS
+a(LATEST_NEWS.url) #[+tag.o-icon New!] #{LATEST_NEWS.title}
.o-content
+grid
+grid-col("two-thirds")
+terminal("lightning_tour.py").
# Install: pip install spacy && python -m spacy download en
import spacy
# Load English tokenizer, tagger, parser, NER and word vectors
nlp = spacy.load('en')
# Process a document, of any size
text = open('war_and_peace.txt').read()
doc = nlp(text)
# Hook in your own deep learning models
similarity_model = load_my_neural_network()
def install_similarity(doc):
doc.user_hooks['similarity'] = similarity_model
nlp.pipeline.append(install_similarity)
doc1 = nlp(u'the fries were gross')
doc2 = nlp(u'worst fries ever')
doc1.similarity(doc2)
+grid-col("third")
+h(2) Features
+list
+item Non-destructive #[strong tokenization]
+item Syntax-driven sentence segmentation
+item Pre-trained #[strong word vectors]
+item Part-of-speech tagging
+item #[strong Named entity] recognition
+item Labelled dependency parsing
+item Convenient string-to-int mapping
+item Export to numpy data arrays
+item GIL-free #[strong multi-threading]
+item Efficient binary serialization
+item Easy #[strong deep learning] integration
+item Statistical models for #[strong English] and #[strong German]
+item State-of-the-art speed
+item Robust, rigorously evaluated accuracy
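//- A short illustrative sketch of some of the annotations listed above:
//- tokenization, part-of-speech tags, dependency labels and named
//- entities. The example sentence is invented.
+terminal("features_demo.py").
    # Inspect linguistic annotations on a processed document.
    import spacy
    nlp = spacy.load('en')
    doc = nlp(u'Apple is looking at buying a U.K. startup for $1 billion')
    for token in doc:
        print(token.text, token.pos_, token.dep_, token.head.text)
    for ent in doc.ents:
        print(ent.text, ent.label_)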
.o-inline-list
+button("/docs/usage/lightning-tour", true, "secondary")
| See examples
.o-block.u-text-center.u-padding
h3.u-text-label.u-color-subtle.o-block spaCy is trusted by
each row in logos
+grid("center").o-inline-list
each details, name in row
+a(details[0])
img(src="/assets/img/logos/#{name}.png" alt=name width=(details[1] || 150)).u-padding-small
.u-pattern.u-padding
+grid.o-card.o-content
+grid-col("quarter")
img(src="/assets/img/profile_matt.png" width="280")
+grid-col("three-quarters")
+h(2) What's spaCy all about?
p
| By 2014, I'd been publishing NLP research for about 10
| years. During that time, I saw a huge gap open between the
| technology that Google-sized companies could take to market,
| and what was available to everyone else. This was especially
| clear when companies started trying to use my research. Like
| most research, my work was free to read, but expensive to
| apply. You could run my code, but its requirements were
| narrow. My code's mission in life was to print results
| tables for my papers — it was good at this job, and bad at
| all others.
p
| spaCy's #[+a("/docs/api/philosophy") mission] is to make
| cutting-edge NLP practical and commonly available. That's
| why I left academia in 2014, to build a production-quality
| open-source NLP library. It's why
| #[+a("https://twitter.com/_inesmontani") Ines] joined the
| project in 2015, to build visualisations, demos and
| annotation tools that make NLP technologies less abstract
| and easier to use. Together, we've founded
| #[+a(COMPANY_URL, true) Explosion AI] to develop data packs
| you can drop into spaCy to extend its capabilities. If
| you're processing Hindi insurance claims, you need a model
| for that. We can build it for you.
.o-block
+a("https://twitter.com/honnibal")
+svg("graphics", "matt-signature", 60, 45).u-color-theme