pkgs/sternenseemann/tep: init

pkgs/sternenseemann/emoji-generic: init at unstable-2020-11-22
author: sternenseemann <0rpkxez4ksa01gb3typccl0i@systemli.org> 2021-02-05 11:01:43 +0100
committer: sternenseemann <0rpkxez4ksa01gb3typccl0i@systemli.org> 2021-02-06 20:15:36 +0100
commit: 0960a4353bb38096e71afe3fa2e3ed89dc612117 (patch)
tree: efb0c190e39b4068ec913586ef297f9eda6b458c /pkgs/sternenseemann/tep
parent: b1a493ed51858fdbd7b24f844782157fa83e5319 (diff)
3 files changed, 153 insertions, 0 deletions
diff --git a/pkgs/sternenseemann/tep/README.md b/pkgs/sternenseemann/tep/README.md
new file mode 100644
index 00000000..71fd08d9
--- /dev/null
+++ b/pkgs/sternenseemann/tep/README.md
@@ -0,0 +1,81 @@
+# tep — tep emoji picker
+
+`tep` is a simple (maybe even trivial) emoji picker.
+It lets you search for emojis using [`bemenu`](https://github.com/Cloudef/bemenu);
+the selected one can then be copied to your clipboard
+or printed to `stdout`.
+
+Its main strength is that the list of emojis is generated
+(at build time) from `emoji-test.txt`, a file provided by
+the Unicode Consortium for every emoji version. It describes
+what the Unicode Consortium thinks should be an emoji keyboard.
+See [Unicode Technical Standard #51](https://unicode.org/reports/tr51/index.html#Data_Files)
+for a (pretty unenlightening) description of the emoji data
+files.
+
+This way we get a list of emojis associated with their CLDR
+short name and their group and subgroup in the keyboard
+(e.g. Travel & Places → transport-air).
+
+The list includes the following types of emojis or rather
+character sequences:
+
+* fully qualified and minimally qualified emojis: these are
+  what you would generally consider an emoji: a character
+  or sequence of characters that show up as a single emoji
+  like 🥺 and 👩🏿‍🦰.
+* unqualified emojis: characters which don't have an
+  emoji presentation by default; usually they must be used
+  as part of an emoji presentation sequence to force them
+  to be rendered as an emoji. These will probably be
+  filtered out in the future.
+* components: These change other emojis by changing their
+  skin tone or hair etc.
+
+## usage
+
+```
+tep [copy] [bemenu args]
+```
+
+* if `copy` is the first argument, `tep` will copy the emoji
+  to the clipboard, otherwise it's printed to `stdout`
+* any other arguments are passed to `bemenu`.
+
+```
+tep copy -l 25 -i
+```
+
+Shows 25 emojis at once in a vertical list, lets you search
+through them case insensitively, and copies the selected one
+to your clipboard as soon as you hit enter.
+
+## tep-data
+
+`tep-data` is the core component of `tep`. It is a small Haskell
+program which reads `emoji-test.txt` from `stdin` and converts it
+to a searchable format for `bemenu`. To improve performance and to
+limit run time dependencies it is only run at build time.
+
+It uses the `emoji-test.txt` parser from [`emoji-generic`](../../lib/hs/emoji-generic);
+apart from that it depends on `bytestring`, `text`, `attoparsec` and
+`utf8-light`. If you want to use it separately, it can be built
+from `tepData.hs`.
+
+The format is not stable, but it's guaranteed that each line starts
+with the emoji it describes, terminated by an ASCII space
+(`0x20`).
+
+## building / configuration
+
+`tep` is currently built with `nix`. For default settings
+run `nix-build -A tep` from the root of this repository.
+
+The derivation can be called with the following optional
+arguments to customize its behavior:
+
+* `copy`: command to call to copy the emoji to the clipboard,
+  defaults to `wl-copy --trim-newline` for Wayland support
+* `fromTep`: command to use to extract the emoji from the
+  tep data line, defaults to `cut -d' ' -f1`
+* `emojiTestTxt`: derivation for `emoji-test.txt` to be used
diff --git a/pkgs/sternenseemann/tep/default.nix b/pkgs/sternenseemann/tep/default.nix
new file mode 100644
index 00000000..9778d536
--- /dev/null
+++ b/pkgs/sternenseemann/tep/default.nix
@@ -0,0 +1,30 @@
+{ writeHaskell, writeBashBin, runCommandLocal
+, emoji-generic, utf8-light, attoparsec, text, bytestring
+, bemenu
+, fromTep ? "cut -d' ' -f1"
+, copy ? "wl-copy --trim-newline"
+, emojiTestTxt
+}:
+
+let
+  tepData = writeHaskell "tep-data" {
+    libraries = [ emoji-generic utf8-light attoparsec text bytestring ];
+  } ./tepData.hs;
+  emojis = runCommandLocal "emojis.txt" {} ''
+    ${tepData} < ${emojiTestTxt} > "$out"
+  '';
+in
+
+writeBashBin "tep" ''
+  copy=false
+  if [[ "$1" = "copy" ]]; then
+    copy=true
+    shift
+  fi
+  ${bemenu}/bin/bemenu "$@" < ${emojis} | ${fromTep} | \
+  if $copy; then
+    ${copy}
+  else
+    cat
+  fi
+''
diff --git a/pkgs/sternenseemann/tep/tepData.hs b/pkgs/sternenseemann/tep/tepData.hs
new file mode 100644
index 00000000..35a69319
--- /dev/null
+++ b/pkgs/sternenseemann/tep/tepData.hs
@@ -0,0 +1,42 @@
+{-# LANGUAGE OverloadedStrings #-}
+module Main where
+
+import Text.Emoji.DataFiles.EmojiTest
+
+import Codec.Binary.UTF8.Light (w2c)
+import qualified Data.Attoparsec.Text as A
+import qualified Data.ByteString as B
+import Data.Text (Text ())
+import qualified Data.Text as T
+import qualified Data.Text.Encoding as T
+import Data.Word (Word32)
+
+parseEmojiTest :: Text -> Either String EmojiTest
+parseEmojiTest = A.parseOnly emojiTestFile
+
+reportError :: Either String a -> IO a
+reportError (Left e)  = fail e
+reportError (Right x) = pure x
+
+outputTepFormat :: EmojiTest -> IO ()
+outputTepFormat = mapM_ (outputEntry [])
+  where outputEntry p (Comment _) = pure ()
+        outputEntry p (Group _ name es) =
+          mapM_ (outputEntry (p ++ [name])) es
+        outputEntry p (Entry codes _ _ name) =
+          outputEmoji codes name p
+
+codesText :: [Word32] -> Text
+codesText = T.pack . map w2c
+
+outputEmoji :: [Word32] -> Text -> [Text] -> IO ()
+outputEmoji codes name path = B.putStr . T.encodeUtf8
+  $  codesText codes
+  <> " " <> name <> " ("
+  <> T.intercalate " → " path <> ")\n"
+
+readStdin :: IO Text
+readStdin = T.decodeUtf8 <$> B.getContents
+
+main :: IO ()
+main = readStdin >>= reportError . parseEmojiTest >>= outputTepFormat
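
Note on the tep data format: the README in this patch guarantees only that each line starts with the emoji, followed by an ASCII space. The following is a minimal sketch of what the default `fromTep` command (`cut -d' ' -f1`) does with such a line. The sample line is an assumption, written to match the layout `outputEmoji` produces; it is not copied from a real build output.

```shell
# Illustrative line (an assumption), following outputEmoji's layout:
#   <emoji> <name> (<group> → <subgroup>)
line='🥺 pleading face (Smileys & Emotion → face-concerned)'

# Default fromTep: keep everything before the first ASCII space.
emoji=$(printf '%s\n' "$line" | cut -d' ' -f1)

printf '%s\n' "$emoji"
```

Since the rest of the line is not a stable format, a custom `fromTep` should probably rely on nothing but that first space-delimited field.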