Port variable-length-quantity exercise. #960
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# Instructions | ||
|
||
Implement variable length quantity encoding and decoding. | ||
|
||
The goal of this exercise is to implement [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) encoding/decoding. | ||
|
||
In short, the goal of this encoding is to encode integer values in a way that would save bytes. | ||
Only the first 7 bits of each byte are significant (right-justified; sort of like an ASCII byte). | ||
So, if you have a 32-bit value, you have to unpack it into a series of 7-bit bytes. | ||
Of course, you will have a variable number of bytes depending upon your integer. | ||
To indicate which is the last byte of the series, you leave bit #7 clear. | ||
In all of the preceding bytes, you set bit #7. | ||
|
||
So, if an integer is between `0-127`, it can be represented as one byte. | ||
Although VLQ can deal with numbers of arbitrary sizes, for this exercise we will restrict ourselves to only numbers that fit in a 32-bit unsigned integer. | ||
Here are examples of integers as 32-bit values, and the variable length quantities that they translate to: | ||
|
||
```text | ||
NUMBER        VARIABLE QUANTITY | ||
00000000      00 | ||
00000040      40 | ||
0000007F      7F | ||
00000080      81 00 | ||
00002000      C0 00 | ||
00003FFF      FF 7F | ||
00004000      81 80 00 | ||
00100000      C0 80 00 | ||
001FFFFF      FF FF 7F | ||
00200000      81 80 80 00 | ||
08000000      C0 80 80 00 | ||
0FFFFFFF      FF FF FF 7F | ||
``` |
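The encoding rule described above can be sketched as follows. This is a minimal illustration, not the PR's example solution; the name `encodeVlq` is hypothetical. It peels off 7 bits at a time, keeps bit 7 clear on the final octet, and sets it on every preceding octet.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Word (Word32, Word8)

-- Hypothetical helper (not part of the PR): encode one value as VLQ octets.
encodeVlq :: Word32 -> [Word8]
encodeVlq n = go (n `shiftR` 7) [fromIntegral (n .&. 0x7F)]
  where
    -- The last octet is built first and keeps bit 7 clear.
    -- Every octet prepended afterwards carries the continuation bit 0x80.
    go :: Word32 -> [Word8] -> [Word8]
    go 0 acc = acc
    go rest acc =
      go (rest `shiftR` 7) ((fromIntegral (rest .&. 0x7F) .|. 0x80) : acc)
```

For instance, `encodeVlq 0x80` yields the octets `81 00`, matching the fifth row of the table.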
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
  "authors": [], | ||
  "files": { | ||
    "solution": [], | ||
    "test": [], | ||
    "example": [] | ||
  } | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
[canonical-tests] | ||
|
||
# zero | ||
"35c9db2e-f781-4c52-b73b-8e76427defd0" = true | ||
|
||
# arbitrary single byte | ||
"be44d299-a151-4604-a10e-d4b867f41540" = true | ||
|
||
# largest single byte | ||
"ea399615-d274-4af6-bbef-a1c23c9e1346" = true | ||
|
||
# smallest double byte | ||
"77b07086-bd3f-4882-8476-8dcafee79b1c" = true | ||
|
||
# arbitrary double byte | ||
"63955a49-2690-4e22-a556-0040648d6b2d" = true | ||
|
||
# largest double byte | ||
"29da7031-0067-43d3-83a7-4f14b29ed97a" = true | ||
|
||
# smallest triple byte | ||
"3345d2e3-79a9-4999-869e-d4856e3a8e01" = true | ||
|
||
# arbitrary triple byte | ||
"5df0bc2d-2a57-4300-a653-a75ee4bd0bee" = true | ||
|
||
# largest triple byte | ||
"f51d8539-312d-4db1-945c-250222c6aa22" = true | ||
|
||
# smallest quadruple byte | ||
"da78228b-544f-47b7-8bfe-d16b35bbe570" = true | ||
|
||
# arbitrary quadruple byte | ||
"11ed3469-a933-46f1-996f-2231e05d7bb6" = true | ||
|
||
# largest quadruple byte | ||
"d5f3f3c3-e0f1-4e7f-aad0-18a44f223d1c" = true | ||
|
||
# smallest quintuple byte | ||
"91a18b33-24e7-4bfb-bbca-eca78ff4fc47" = true | ||
|
||
# arbitrary quintuple byte | ||
"5f34ff12-2952-4669-95fe-2d11b693d331" = true | ||
|
||
# maximum 32-bit integer input | ||
"7489694b-88c3-4078-9864-6fe802411009" = true | ||
|
||
# two single-byte values | ||
"f9b91821-cada-4a73-9421-3c81d6ff3661" = true | ||
|
||
# two multi-byte values | ||
"68694449-25d2-4974-ba75-fa7bb36db212" = true | ||
|
||
# many multi-byte values | ||
"51a06b5c-de1b-4487-9a50-9db1b8930d85" = true | ||
|
||
# one byte | ||
"baa73993-4514-4915-bac0-f7f585e0e59a" = true | ||
|
||
# two bytes | ||
"72e94369-29f9-46f2-8c95-6c5b7a595aee" = true | ||
|
||
# three bytes | ||
"df5a44c4-56f7-464e-a997-1db5f63ce691" = true | ||
|
||
# four bytes | ||
"1bb58684-f2dc-450a-8406-1f3452aa1947" = true | ||
|
||
# maximum 32-bit integer | ||
"cecd5233-49f1-4dd1-a41a-9840a40f09cd" = true | ||
|
||
# incomplete sequence causes error | ||
"e7d74ba3-8b8e-4bcb-858d-d08302e15695" = true | ||
|
||
# incomplete sequence causes error, even if value is zero | ||
"aa378291-9043-4724-bc53-aca1b4a3fcb6" = true | ||
|
||
# multiple values | ||
"a91e6f5a-c64a-48e3-8a75-ce1a81e0ebee" = true | ||
|
||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
name: variable-length-quantity | ||
version: 0.0.0.1 # TODO: what should this number be? | ||
Example solutions don't have a version number. |
||
|
||
dependencies: | ||
  - base | ||
|
||
library: | ||
  exposed-modules: Vlq | ||
  source-dirs: src | ||
  ghc-options: -Wall | ||
  dependencies: | ||
    - mtl | ||
|
||
tests: | ||
  test: | ||
    main: Tests.hs | ||
    source-dirs: test | ||
    dependencies: | ||
      - variable-length-quantity | ||
      - hspec | ||
      - text | ||
      - QuickCheck |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,66 @@ | ||||||
{-# LANGUAGE BinaryLiterals #-} | ||||||
{-# LANGUAGE FlexibleContexts #-} | ||||||
{-# LANGUAGE LambdaCase #-} | ||||||
{-# LANGUAGE NumericUnderscores #-} | ||||||
|
||||||
module Vlq | ||||||
How about
Suggested change
I have the impression that one-word module names are preferred for exercises with long names, but I couldn't find a good one, |
||||||
  ( encodes | ||||||
  , decodes | ||||||
  , DecodeError (..) | ||||||
  ) | ||||||
where | ||||||
|
||||||
import Control.Monad | ||||||
import Control.Monad.Except | ||||||
import Control.Monad.State.Strict | ||||||
import Data.Bits | ||||||
import Data.List | ||||||
import Data.Word | ||||||
|
||||||
data DecodeError | ||||||
  = IncompleteSequence | ||||||
  | TooManyBits | ||||||
  deriving (Show, Eq) | ||||||
|
||||||
encodeOne :: Word32 -> [Word8] | ||||||
encodeOne 0 = [0] | ||||||
encodeOne x = reverse . unfoldr go $ (x, True) | ||||||
  where | ||||||
    go (cur, fstOctet) = do | ||||||
      guard $ cur /= 0 | ||||||
      let (q, r) = cur `quotRem` 0b1000_0000 | ||||||
          r' = fromIntegral $ if fstOctet then r else r .|. 0b1000_0000 | ||||||
      pure (r', (q, False)) | ||||||
|
||||||
encodes :: [Word32] -> [Word8] | ||||||
encodes = concatMap encodeOne | ||||||
|
||||||
decodeOne :: MonadError DecodeError m => [Word8] -> m Word32 | ||||||
Very nice to have something like this in an example.
Thanks! Does anything in this part need to be addressed?
No, it's perfectly fine. |
||||||
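decodeOne xs = do | ||||||
  let l = length xs | ||||||
  -- An empty group has no terminating octet. | ||||||
  when (l == 0) $ | ||||||
    throwError IncompleteSequence | ||||||
  -- More than 5 octets, or 5 octets whose first carries more than 4 payload bits, cannot fit in 32 bits. | ||||||
  when (l > 5 || (l == 5 && head xs > 0b1000_1111)) $ | ||||||
    throwError TooManyBits | ||||||
  pure $ foldl (\acc x -> (acc `unsafeShiftL` 7) .|. (fromIntegral x .&. 0b0111_1111)) 0 xs | ||||||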
|
||||||
mayDecodeNext :: (MonadState [Word8] m, MonadError DecodeError m) => m (Maybe Word32) | ||||||
mayDecodeNext = | ||||||
  get >>= \case | ||||||
    [] -> pure Nothing | ||||||
    st | ||||||
      | (highs, rest) <- span ((/= 0) . (.&. 0b1000_0000)) st -> | ||||||
        Just | ||||||
          <$> case rest of | ||||||
            [] -> throwError IncompleteSequence | ||||||
            low : rest' -> do | ||||||
              put rest' | ||||||
              decodeOne (highs <> [low]) | ||||||
|
||||||
decodes :: [Word8] -> Either DecodeError [Word32] | ||||||
decodes = evalState (runExceptT decodeAll) | ||||||
  where | ||||||
    decodeAll = | ||||||
      mayDecodeNext >>= \case | ||||||
        Nothing -> pure [] | ||||||
        Just x -> (x :) <$> decodeAll |
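For comparison, the decoding logic can also be sketched without the `mtl` stack. This is an illustration under stated assumptions, not the PR's code: the names `decodeVlq`, `VlqError`, `Incomplete` and `Overflow` are hypothetical, and unlike the example solution this sketch accepts groups padded with leading `80` octets as long as the value still fits in 32 bits.

```haskell
import Data.Bits (shiftL, testBit, (.&.), (.|.))
import Data.Word (Word32, Word8)

-- Hypothetical error type, standing in for the PR's DecodeError.
data VlqError = Incomplete | Overflow
  deriving (Show, Eq)

-- Hypothetical helper: decode a stream of VLQ groups into 32-bit values.
decodeVlq :: [Word8] -> Either VlqError [Word32]
decodeVlq = go 0 (0 :: Int)
  where
    -- acc: payload bits gathered for the current group
    -- k:   number of continuation octets consumed in the current group
    go :: Word32 -> Int -> [Word8] -> Either VlqError [Word32]
    go _ 0 [] = Right []          -- input ended on a group boundary
    go _ _ [] = Left Incomplete   -- input ended inside a group
    go acc k (b : bs)
      | acc > 0x01FFFFFF = Left Overflow   -- shifting left by 7 would drop set bits
      | testBit b 7 = go acc' (k + 1) bs   -- continuation bit set: keep reading
      | otherwise = (acc' :) <$> go 0 0 bs -- bit 7 clear: the group is complete
      where
        acc' = (acc `shiftL` 7) .|. fromIntegral (b .&. 0x7F)
```

A single recursive pass keeps the state explicit in the arguments, which is the trade-off against the `StateT`/`ExceptT` formulation in the diff.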
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
name: variable-length-quantity | ||
version: 0.0.0.1 # TODO: what should this number be? | ||
This question cannot currently be answered. This track's exercise versioning policy is described in the track README: https://github.com/exercism/haskell#exercise-versioning Which originally meant "Go to the exercise's canonical-data.json and look up the Some of the technical argumentation refers to automated test generators:
Since the Haskell track does not employ an automated test generator, the test generator is a person who does use the latest version, but does so manually. This reasoning does not seem to apply to this track as long as we manually maintain test files.
tl;dr: Canonical versions were removed, and our exercise versioning policy depends on them, so we need to make a new versioning policy. |
||
|
||
dependencies: | ||
  - base | ||
|
||
library: | ||
  exposed-modules: Vlq | ||
  source-dirs: src | ||
  ghc-options: -Wall | ||
  # dependencies: | ||
  # - foo # List here the packages you | ||
  # - bar # want to use in your solution. | ||
|
||
tests: | ||
  test: | ||
    main: Tests.hs | ||
    source-dirs: test | ||
    dependencies: | ||
      - variable-length-quantity | ||
      - hspec | ||
      - text | ||
      - QuickCheck |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
module Vlq | ||
  ( encodes | ||
  , decodes | ||
  , DecodeError (..) | ||
  ) | ||
where | ||
|
||
import Data.Word | ||
|
||
data DecodeError | ||
  = IncompleteSequence | ||
  | TooManyBits | ||
  deriving (Show, Eq) | ||
|
||
encodes :: [Word32] -> [Word8] | ||
encodes = error "You need to implement this function." | ||
|
||
decodes :: [Word8] -> Either DecodeError [Word32] | ||
decodes = error "You need to implement this function." |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
resolver: lts-16.21 |
It seems that you are better informed than I am about the purpose of .meta/tests.toml.
I'd have to get back to you about this, or perhaps you can link to where its format and purpose are documented.
Actually, I'm not well informed either, but many other exercises have this file, and its content seems straightforward to follow.