(C) 2024 Swudu Susuwu, dual licenses: choose GPLv2 or Apache 2 (allows all uses).

Table of Contents

Purposes

./.ssh/ is to compute signatures/certificates.

./posts/ stages posts (school classes) for https://SwuduSusuwu.SubStack.com/ about artificial neural tissue, antivirus, assistants, plus autonomous tools.

./c/ C implementations of posts (TODO, issue #3 which you can contribute to, or can request that more resources go to this task)

  • ./c/rfc6234/ is vendored code (direct from the official RFC6234), which is used for {classSha128(), classSha256(), classSha512()}.

./cxx/ C++ implementations of posts

  • ./cxx/Macros.hxx is
    • macros which wrap C++ features/attributes, such as {SUSUWU_ASSUME, SUSUWU_CONSTEXPR, SUSUWU_DEFAULT, SUSUWU_DELETE, SUSUWU_EXPECTS, SUSUWU_ENSURES, SUSUWU_FINAL, SUSUWU_IF_CPLUSPLUS, SUSUWU_NOEXCEPT, SUSUWU_NORETURN, SUSUWU_NULLPTR, SUSUWU_OVERRIDE, SUSUWU_STATIC_ASSERT, SUSUWU_UNREACHABLE} which (if used on old compilers, or with options such as -std=c++11) are replaced with no-ops or alternatives which have the same use,
    • macro options (which control the macro constants/macro functions). (View Options/setup for options),
    • macro constants, such as SUSUWU_SH_<color> (color = {DEFAULT, BLACK, DARK_GRAY, RED, LIGHT_RED, GREEN, LIGHT_GREEN, BROWN, YELLOW, BLUE, LIGHT_BLUE, PURPLE, LIGHT_PURPLE, CYAN, LIGHT_CYAN, LIGHT_GRAY, WHITE}, if supported, expands to the ANSI color codes, else expands to ""),
    • macro functions, such as {SUSUWU_ERROR, SUSUWU_WARNING, SUSUWU_INFO, SUSUWU_SUCCESS, which use SUSUWU_PRINT}, SUSUWU_PRINT (if __cplusplus, uses SUSUWU_CERR, else uses SUSUWU_STDERR),
    • macroTestsNoexcept() (unit tests, with return value for errors).
  • ./cxx/ClassObject.hxx is
  • ./cxx/ClassPortableExecutable.hxx is
    • FilePath (PortableExecutable's constructor argument), FileBytecode (classSha2's input argument), FileHash (classSha2's return value)
    • class PortableExecutable : public Object (stores file path and/or bytecode and/or hexcode. TODO; hash?)
    • class PortableExecutableBytecode : public PortableExecutable loads bytecode from path. TODO; hash?
  • ./cxx/ClassSys.hxx is
    • typedefs {ClassSysUSeconds}
    • globals {classSysArgc, classSysArgs}
    • modular functions to interact with: console (Posix /bin/sh or Windows cmd) {classSysGetConsoleInput(), classSysSetConsoleInput()}, own process {classSysInit(), classSysGetOwnPath(), classSysFopenOwnPath(), templateCatchAll()}, strings (or streams) {classSysHexOs(), classSysHexStr(), classSysColoredParamOs(), classSysColoredParamStr()}, the OS {classSysUSecondClock(), execvesFork(), execvexFork(), execves(), execvex(), classSysHasRoot(), classSysSetRoot(), classSysKernelCallback(), classSysKernelSetHook()}. TODO: filesystem (perhaps just have ./cxx/ClassPortableExecutable.hxx do this?), internet.
    • classSysTests(), or classSysTestsNoexcept() (unit tests with exceptions for errors, or return value for errors).
  • ./cxx/ClassSha2.hxx is
    • the classSha2 function pointer, which defaults to classSha256() (but you can set classSha2 = sha128; or classSha2 = sha512;), wrapped around official RFC6234 code. ./cxx/ClassResultList.hxx, ./cxx/VirusAnalysis.cxx and ./cxx/AssistantCns.cxx all use classSha2.
    • classSha2Tests(), or classSha2TestsNoexcept() (unit tests with exceptions for errors, or return value for errors).
  • ./cxx/ClassCns.hxx is class Cns : public Object (abstract neural system class with pure virtuals.) Issue #6 is to implement this class.
  • ./cxx/ClassResultList.hxx is
    • class ResultList : public Object (holds hashes, signatures, bytecodes); resultList*() functions {resultListDumpTo(), resultListProduceHashes() (virusAnalysisTests() uses this)}.
    • modular template (can use on all containers such as std::vector, std::map or std::list) list*() functions (such as listMaxSize(), listDumpTo(), listToHashes(), listIntersections(), listsIntersect(), listFindValue(), listHasValue(), listFindSubstr(), listHasSubstr(), listProduceSignature() (produceAbortListSignatures uses this), listFindSignatureOfValue(), listHasSignatureOfValue() (signatureAnalysis() uses this), explodeToList (./cxx/AssistantCns.cxx uses this), produce unique signature, compare file against list of signatures), most of which were produced for antivirus signature analysis.
    • classResultListsTests(), or classResultListsTestsNoexcept() (unit tests with exceptions for errors, or return value for errors).
  • ./cxx/VirusAnalysis.hxx is
    • modular helper functions {produceAbortListSignatures() (for signatureAnalysis() use), importedFunctionsList() (work-in-progress, staticAnalysis() uses this), straceOutputsAnalysis() (work-in-progress, sandboxAnalysis() uses this), produceAnalysisCns() (for cnsAnalysis() use), produceVirusFixCns() (for cnsVirusFix() use)},
    • kernel hook function (virusAnalysisHook()), which uses virusAnalysis() to scan new downloads (or scan all programs which execute); work-in-progress.
    • modular scan functions {hashAnalysis(), signatureAnalysis(), staticAnalysis() (processes Executable and Linkable Format or Portable Executables, work-in-progress), sandboxAnalysis() (executes with strace + chroot, work-in-progress), cnsAnalysis() (uses ClassCns.hxx)} plus disinfection function (cnsVirusFix()) which form an antivirus program.
    • virusAnalysisTests(), or virusAnalysisTestsNoexcept() (unit tests with exceptions for errors, or return value for errors).
  • ./cxx/AssistantCns.hxx is
    • modular functions {assistantCnsDownloadHosts() (uses wget on assistantCnsDefaultHosts), assistantCnsProcessXhtml() (uses the next 3 functions to process wget's downloads: assistantCnsProcessUrls (uses boost/property_tree/xml_parser.hpp to extract new URLs), assistantCnsProcessQuestion (work-in-progress, extracts question), assistantCnsProcessResponses() (work-in-progress, extracts answers)), produceAssistantCns() (uses datasets for backpropagation), assistantCnsProcess (uses forwardpropagation to answer new questions)} which form an assistant.
    • assistantCnsTests(), or assistantCnsTestsNoexcept() (unit tests with exceptions for errors, or return value for errors).
  • ./cxx/main.hxx is SusuwuUnitTestsBitmask main() (executes all of those *TestsNoexcept() unit tests into a bitmask return value). All have lots of issues which you can contribute to (or can request that more resources go to).
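How several `*TestsNoexcept()` results could fold into one bitmask return value can be sketched as below. This is a minimal sketch, not the code from ./cxx/main.hxx: the enum `ExampleUnitTestsBitmask`, its constants, and both `example*TestsNoexcept()` stubs are hypothetical names for illustration, and the sketch assumes that each test function returns `true` on success.

```cpp
/* Hypothetical bitmask; the actual constants are in ./cxx/main.hxx */
typedef enum ExampleUnitTestsBitmask {
	exampleUnitTestsSuccess = 0,
	exampleUnitTestsSha2 = 1 << 0, /* bit is set if the hash tests fail */
	exampleUnitTestsSys = 1 << 1 /* bit is set if the system tests fail */
} ExampleUnitTestsBitmask;

/* Stubs (assumption: `true` is success, `false` is error, no exceptions) */
bool exampleSha2TestsNoexcept() noexcept { return true; }
bool exampleSysTestsNoexcept() noexcept { return false; /* simulated failure */ }

/* @return Bitwise OR of the bits of all tests which failed; 0 if all passed */
int exampleUnitTests() noexcept {
	int failures = exampleUnitTestsSuccess;
	if(!exampleSha2TestsNoexcept()) {
		failures |= exampleUnitTestsSha2;
	}
	if(!exampleSysTestsNoexcept()) {
		failures |= exampleUnitTestsSys;
	}
	return failures;
}
```

A caller can test single bits (such as `exampleUnitTests() & exampleUnitTestsSys`) to learn which module failed, so one return value reports on all modules.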

./Macros.sh is a modular /bin/sh script with functions which ./build.sh uses.

./build.sh does what ./configure, make often do. (View Options/setup for options).

How to use this

Minimum requirements (build targets which this supports):

Download

Download source with git clone https://github.com/SwuduSusuwu/SubStack.git. If this does not have all the tools you want, you can opt-in to the beta with git switch experimental (opt-out with git switch trunk).

Signature/certificate

./.ssh/setup.sh sets up gpg.ssh.allowedSignersFile (which allows you to use git verify <ref> or git log --show-signature).

[Notice: This public crypto "signature", is not related to "signature analysis" (Substr scans).]
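To show what the other sense ("signature analysis" as substring scans) refers to, here is a minimal sketch. `exampleHasSignature()` is a hypothetical helper for illustration; it is not the project's `signatureAnalysis()`, and it assumes that signatures are plain byte strings.

```cpp
#include <string> /* std::string */
#include <vector> /* std::vector */

/* Hypothetical helper (not the project's `signatureAnalysis()`).
 * @return `true` if `bytecode` has at least one of `signatures` as a substring */
bool exampleHasSignature(const std::string &bytecode, const std::vector<std::string> &signatures) {
	for(const std::string &signature : signatures) {
		/* skips empty signatures, since `find("")` matches all inputs */
		if(!signature.empty() && std::string::npos != bytecode.find(signature)) {
			return true;
		}
	}
	return false;
}
```

No keys or certificates are used in such scans; the match is on content alone.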

Options/setup

Usage: ./build.sh [OPTIONS] produces objects (./obj/*.o, for distribution into other tools,) plus Executable and Linkable Format (./bin/Susuwu.out, to do examples/unit tests which prove how effective functions execute,) both of which you can redirect with export OBJDIR=___ (or export BINDIR=___.)

  • ./cxx/main.hxx has constants to use to interpret Susuwu.out's return values.
  • Environment flags: as GNU make's.
  • Console flags:
    • ./build.sh : Defaults to ./build.sh --debug. For all source code, if intermediate object doesn't exist or is older than source, builds source.
    • ./build.sh --clean : removes intermediate object files + exits; to reduce disc use.
    • ./build.sh --rebuild : removes intermediate object files + continues; to rebuild with new flags (or if ./build.sh doesn't rebuild code which includes updated headers).
    • ./build.sh --debug : includes frame-pointers/debug symbols (-g), includes valgrind-replacement tools (such as -fsanitize=address), optimizes with -Og.
    • ./build.sh --release : excludes --debug (-DNDEBUG), strips frame-pointers/symbols, optimizes with -O2.
    • ./build.sh --mingw : can mix with --release or --debug. Produces Portable Executable (./bin/Susuwu.exe), for Windows.
  • Macro flags (use vim build.sh to put into FLAGS_USER). If =true, most use more resources, except SUSUWU*PREFER_* or SUSUWU*SKIP_*. "default is =!defined(NDEBUG)" is short for; "if --debug, default =true, but if --release, default =false".
    • -DSUSUWU_UNIT_TESTS[=true|=false] with =true to build + execute unit tests. Default is =true, but a more stable future version could have default =!defined(NDEBUG). If set to =false, compilation time, object size, and executable size are reduced (to around half).
    • Custom sh (console) output:
      • -DSUSUWU_SH_PREFER_STDIO=true to replace std::cXXX << ... with fprintf(stdXXX, ...); default is =!defined(__cplusplus).
      • -DSUSUWU_SH_VERBOSE[=true|=false] with =true to print diagnostic messages (SUSUWU_SH_USE_FILE, SUSUWU_SH_USE_LINE, SUSUWU_NOTICE, SUSUWU_DEBUG, SUSUWU_DEBUGEXECUTE, SUSUWU_NOTICE_EXECUTE, SUSUWU_DEBUG_EXECUTE all use #if SUSUWU_SH_VERBOSE); default is =!defined(NDEBUG).
      • -DSUSUWU_SH_SKIP_BRACKETS=true sets output format to WARN_LEVEL: message; default is =false.
      • -DSUSUWU_SH_FILE=true sets output format to [__FILE__: WARN_LEVEL: message]; default is =SUSUWU_SH_VERBOSE.
      • -DSUSUWU_SH_LINE=true sets output format to [__LINE__: WARN_LEVEL: message]; default is =SUSUWU_SH_VERBOSE.
      • -DSUSUWU_SH_FUNC=true sets output format to [__func__: WARN_LEVEL: message]; default is =false.
      • -DSUSUWU_SH_SKIP_COLORS=true to omit VT100 (ANSI) colors; default is =defined(SUSUWU_SH_COLORS_UNSUPPORTED).
      • -DSUSUWU_SH_SKIP_COLORS=false to force (even if unsupported) VT100 (ANSI color) use.
      • TODO (for now, no effect; once issue #17 is closed, you can use):
        • -DSUSUWU_SH_RUNTIME_OSC to replace #ifdef _POSIX_VERSION\nAccessClipboard();\n#endif with termcmp./GetConsoleMode() (for choices on whether or not to use Operating System Commands); default is undefined.
        • -DSUSUWU_SH_RUNTIME_COLORS to replace #if _POSIX_VERSION\nColors();\n#endif with termcmp./GetConsoleMode() (for choices on whether or not to use colors); default is undefined.
    • To match g++./clang++ console format, use -DSUSUWU_SH_SKIP_BRACKETS=true, -DSUSUWU_SH_FILE=true, -DSUSUWU_SH_LINE=true, -DSUSUWU_SH_FUNC=false (sets output format to __FILE__:__LINE__: WARN_LEVEL: message).
    • Unstable/experimental flags:
      • -DSUSUWU_EXPERIMENTAL to enable experimental (more new, but unfinished/unstable) versions of code; default is unset, unless git switch experimental is executed.
        • -DSUSUWU_DEFAULT_BRANCH if errors, suggest git switch SUSUWU_DEFAULT_BRANCH; default is "trunk".
      • -DSUSUWU_VIRTUAL_OPERATORS_USE_VPTRS=false: ./cxx/ClassObject.hxx:Class::operator==(const Class &obj) { return this->hasLayoutOf(obj) && 0 == memcmp(sizeof(NULL) + (char *)this, sizeof(NULL) + (char *)&obj, this->getObjectSize() - sizeof(NULL)); }, thus Susuwu::Object() == Susuwu::Class() but CXX output with nonstandard vptr layout crashes. Default =true; (return typeid(this) == typeid(obj) && 0 == memcmp(this, *obj, this->getObjectSize());).
      • -DSUSUWU_VIRTUAL_EQUALS_USE_ADDRESSES=false: to use ./cxx/ClassObject.hxx:Object::equals(const Object &obj) { return this->operator==(obj); }. Default is =true (return this == &obj). For now, just controls Object::equals (in future, perhaps SUSUWU_VIRTUAL_OPERATORS_USE_ADDRESSES inherits this).
    • TODO (for now won't build, or has no effect):
      • -DSUSUWU_VIRTUAL_OPERATORS_USE_ADDRESSES=true: No effect. If implemented, Class::operator==(const Class &obj) { return &obj == this; }. Default is =false.
      • -DSUSUWU_PREFER_CSTR=true to replace std::string with char * (more compatible with non-C++ projects); default is =SUSUWU_PREFER_C.
      • -DSUSUWU_PREFER_C=true sets SUSUWU_PREFER_CSTR + SUSUWU_SH_PREFER_STDIO (plus other flags which will exist to allow non-C++ projects to include this); default is =!defined(__cplusplus).
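How such macro flags can have defaults which depend on NDEBUG (or on other flags), yet still allow -D overrides, can be sketched as below. This is a sketch under assumptions, not the contents of ./cxx/Macros.hxx: all `EXAMPLE_*` names and `exampleFormat()` are hypothetical, and only the brackets/file/line options are modeled.

```cpp
#include <sstream> /* std::ostringstream */
#include <string> /* std::string */

/* Hypothetical EXAMPLE_* flags (not the project's SUSUWU_* macros).
 * Each `#ifndef` allows `-DEXAMPLE_...=true|false` to override the default. */
#ifndef EXAMPLE_SH_VERBOSE
#	ifdef NDEBUG
#		define EXAMPLE_SH_VERBOSE false /* release builds: default is quiet */
#	else
#		define EXAMPLE_SH_VERBOSE true /* debug builds: default is verbose */
#	endif
#endif
#ifndef EXAMPLE_SH_FILE
#	define EXAMPLE_SH_FILE EXAMPLE_SH_VERBOSE /* default inherits from verbose */
#endif
#ifndef EXAMPLE_SH_LINE
#	define EXAMPLE_SH_LINE EXAMPLE_SH_VERBOSE /* default inherits from verbose */
#endif
#ifndef EXAMPLE_SH_SKIP_BRACKETS
#	define EXAMPLE_SH_SKIP_BRACKETS false
#endif

/* @return `message` with the prefixes/brackets which the flags above enable */
std::string exampleFormat(const std::string &file, int line, const std::string &warnLevel, const std::string &message) {
	std::ostringstream os;
	if(!EXAMPLE_SH_SKIP_BRACKETS) {
		os << '[';
	}
	if(EXAMPLE_SH_FILE) {
		os << file << ':';
	}
	if(EXAMPLE_SH_LINE) {
		os << line << ':';
	}
	if(EXAMPLE_SH_FILE || EXAMPLE_SH_LINE) {
		os << ' ';
	}
	os << warnLevel << ": " << message;
	if(!EXAMPLE_SH_SKIP_BRACKETS) {
		os << ']';
	}
	return os.str();
}
```

In a debug build with no overrides, `exampleFormat("a.cxx", 7, "Warning", "message")` gives `[a.cxx:7: Warning: message]`; with `-DEXAMPLE_SH_SKIP_BRACKETS=true` the result is `a.cxx:7: Warning: message`, which is close to the compiler-like format described above.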

How to contribute

View documented issues (for ideas on code to contribute, plus so you do not report documented issues.)

Beta test/experimental builds

  • git switch experimental && ./build.sh
    • View results for symptoms of new issues (hint: look for "Warning:"s or "Error:"s).
    • If you found new issue(s) (which aren't due to misconfigurations in your system), post new issue(s).

Contributor conventions/rules

If your commit introduces/removes functions, have ./README.md#purposes include this. So that code is consistent, pull requests have language-specific syntax rules:

git

Do atomic commits: if you cannot ./build.sh your commit once it is swapped (such as through git rebase -i) with a previous commit, or cannot ./build.sh once a previous commit is reverted (git revert), your commit message must include a note such as "Is followup to <commit hash>" (which shows temporal order).

git commit message format/syntax:

  • affix "()" onto functions (regardless of number of arguments), such as function(), or use the function name (such as function) alone.
  • if commit does git add NewFile: message has +`NewFile`.
  • if git rm Exists: -`Exists`.
  • if touch Exists && git add Exists: @`Exists` or ?`Exists`.
    • Simple wildcards/regex for altered functions: `%s/oldFunction()/newFunction()/`.
    • '*' is not used as update prefix, since '*' has much other use in Regex (wildcards) & C++ (such as block comments, dereferences, or math).
      • From the root commit through 159940fb8b60b176a38a13cdfbd9393596daa9b5 (Date: Thu Jul 4 07:56:01 2024 -0700), '@' was the prefix for updates. From then until this commit, '?' was the prefix for updates.
      • From this commit on (this is the successor to commit 0ae6233c02d9e04fca60027b1e32b885eb69bb8a (Date: Sat Nov 30 17:50:40 2024 -0800)), '@' is (once more) the prefix for updates, due to: it is more common for projects to so use '@'.
  • if echo "int newFunction() {...}" >> Exists && git add Exists: @`Exists`:+`newFunction()`.
  • if git mv OldPath/ NewPath/: `OldPath/` -> `NewPath/` or `mv OldPath/ NewPath/`.
  • as default branch, choose master, main or trunk (do not have more than 1 of those branches, or ./Macros.sh:SUSUWU_DEFAULT_BRANCH() is ambiguous).
  • to indent: use tabs to form blocks, such as:
?`README.md`:
	?`#How-to-use-this`:
	Split into:
		+`## Download`: new; howto clone, howto switch branches.
		+`## Optionssetup`: "Options/setup"; howto use `./build.sh` (with or without options.)
	?`#How-to-contribute`,
	?[Good first issues to contribute to]: (moved into `#How-to-contribute`)

[Notice: Commit titles can omit backticks (``) if not enough room; the backticks just allow GitHub to do Markdown-format code/paths.]

sh source

Is as for C/C++ source, plus specifics to sh:

C/C++ source

Linter: apt install clang && clang-tidy cxx/*.cxx (defaults to .clang-tidy options).

Most of what Mozilla Org's (Firefox's) style suggests is sound (you should follow this unless you have specific reasons not to).

Code rules (lots overlap with Mozilla Org's):

  • Indent: tabs ('^I'); as many tabs as braces ('{' = +1 tab, '}' = -1 tab). [Tab use is all which conflicts with Mozilla Org's format.]

    • Rationales: reduced memory use, allows local configs (such as ~/.vimrc) to set width, allows arrow keys to move fast.
  • Files: ${C_SOURCE_PATH}/PascalCase.* or ${CXX_SOURCE_PATH}/PascalCase.*xx (such as: ./cxx/ClassFoo.hxx, used as #include "ClassFoo.hxx" /* classFooFunction() */), as this is most common.

    • ./build.sh requires: that all local includes prefix as Class*.hxx (so it knows to execute --rebuild if you upgrade a common include.) TODO: incremental builds which don't require this.
    • To assist with insertion/removal of #include statements, comments shall list all functions/macros used.
  • Structs, enums, classes: typedef struct PascalCase {} PascalCase;, typedef enum PascalCase {} PascalCase;, typedef class PascalCase {} PascalCase;, as this is most common.

  • Macros: #define NAMESPACE_CONSTANT_CASE(snake_case_param) assert(snake_case_param);, as this is most common.

  • Braces, functions:

    • Do not produce lots of functions with the same name but different arguments, as such "overloads" make this difficult to port.
    • Single statement blocks can use the form: virtual bool hasInstance() { return true; }.
    • Most common form:
const /* const prevents `if(func() = x)` where you wished for `if(func() == x)` */ bool classPrefixCamelCase(bool s, bool x) {
	if(s && x) {
		return s;
	} else {
		return x;
	}
}
  • Args/params, local variables, objects: const bool camelCase = true; Global variables/objects: extern const bool classFooFunction;, as this is most common.

    • Functions/globals can omit "classFoo" if wrapped as namespace ClassFoo { extern bool global; };, or as class ClassFoo { static bool global; };.
      • The project as a whole should have namespace Susuwu {};, but you can nest namespaces.
  • For error, warning or notice syntax use: "[WARN_LEVEL: OPTIONAL_FUNCTION_NAME {code which triggered the error/warning/diagnostic/notice} /* OPTIONAL COMMENTS */]",

    • ./cxx/Macros.hxx: {SUSUWU_STR(x), SUSUWU_CERR(x), SUSUWU_STDOUT(x)} have the new syntax for this.
    • Modules should choose one of this list to send all userland errors, warnings or notices to:
      • throw std::runtime_error(message) (or throw Q() where class Q : public std::exception)
      • std::cerr << message (or fprintf(stderr, message))
      • errno = code;, or return code;.
  • Comments

    • Comments about possible return code;s (or throws) go above function declarations (Doxygen convention).
      • It is arguable whether or not you should document all possible system errors; most Standard Template Library functions can throw.
    • *.hxx is to document interfaces (above function declarations); *.cxx is to do implementations (do not duplicate interface comments).
    • Use Markdown. Rationales: for non-English users (or for computers), such syntax assists use.
    • Instead of //new single-line comments, prefer /* old fashioned */.
      • Rationale: simpler to port. More obvious where the comment stops if the comment wraps around.
    • Doxygen-ish "@pre"/"@post" prepares for C++26 Contracts:
      • As soon as clang++ (or g++) has Contracts (part of C++26), regular expressions (such as :%s/@pre (.*) @code (.*) @endcode/[[expects: \2]] \\* \1 \\*/ :%s/@post (.*) @code (.*) @endcode/[[ensures: \2]] \\* \1 \\*/) can convert Doxygen-ish comments into contracts.
      • ./cxx/Macros.hxx has SUSUWU_ASSUME(X), which is close to [[expects: x]], but SUSUWU_ASSUME(X) goes to *.cxx, whereas [[expects]] goes to *.hxx.
      • Advantages of [[expects]]: allows to move information of interfaces out of *.cxx, to *.hxx.
/* @throw std::bad_alloc If function uses {`malloc`, `realloc`, `new[]`, `std::*::{push_back, push_front, insert}`}
 * @throw std::logic_error Optional. Would include most functions which use `std::*`
 * @pre @code !output.full() @endcode
 * @post @code !output.empty() @endcode
 */
bool functionDeclaration(std::string input, std::deque<vector> output);
  • Include guards:
#ifndef INCLUDES_Path_To_File
#define INCLUDES_Path_To_File
#endif /* ndef INCLUDES_Path_To_File */
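The include guard, the Doxygen-ish "@pre"/"@post" comments, and the brace/indent rules above can be combined as in the sketch below. `EXAMPLE_EXPECTS`, `EXAMPLE_ENSURES`, the guard name and `exampleStoreInput()` are hypothetical names for illustration (not the SUSUWU_* macros); the sketch assumes that `assert` is an acceptable stand-in until compilers have Contracts.

```cpp
#ifndef INCLUDES_Example_ExampleStore_hxx
#define INCLUDES_Example_ExampleStore_hxx
#include <cassert> /* assert */
#include <deque> /* std::deque */
#include <string> /* std::string */

/* Hypothetical stand-ins: check conditions at runtime (no-ops if NDEBUG) */
#define EXAMPLE_EXPECTS(condition) assert(condition)
#define EXAMPLE_ENSURES(condition) assert(condition)

/* Stores a copy of `input` as the last element of `output`.
 * @throw std::bad_alloc If `output` can not allocate.
 * @pre @code output.size() < output.max_size() @endcode
 * @post @code !output.empty() @endcode
 * @return `true` once `input` is stored. */
inline bool exampleStoreInput(const std::string &input, std::deque<std::string> &output) {
	EXAMPLE_EXPECTS(output.size() < output.max_size());
	output.push_back(input);
	EXAMPLE_ENSURES(!output.empty());
	return true;
}
#endif /* ndef INCLUDES_Example_ExampleStore_hxx */
```

Since each condition is present both in the comment and in the code, a future move to contracts would just have to relocate the conditions, not rediscover them.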

Sponsor

To sponsor this, you can send withdrawable crypto (such as Bitcoin) addresses to contacts which ./SECURITY.md lists.

Escrow

If you want proof that your crypto/cash will go to produce specific systems, use escrow services (send the escrow service the crypto/cash, plus a contract which names an open issue which you choose).