Merged
22 commits
16bfd31
feat(ui): implement Label page for graph schema management
patchmemory Feb 4, 2026
2b418fd
Merge branch 'main' of github.com:patchmemory/scidk
patchmemory Feb 4, 2026
6b79a6c
feat(ui): implement Link page for relationship creation workflows
patchmemory Feb 4, 2026
797c7d8
chore(dev): update submodule pointer for completed task
patchmemory Feb 4, 2026
f605d53
fix(web): register route blueprints in scidk.web.create_app
patchmemory Feb 4, 2026
0409c83
fix(ui): add error handling for link preview when save fails
patchmemory Feb 4, 2026
4a8bd3c
fix(services): use per-request database connections in LabelService
patchmemory Feb 4, 2026
3d18b18
fix(services): start connection management refactor for LinkService
patchmemory Feb 4, 2026
ec88cd9
fix: complete LinkService connection management refactor
patchmemory Feb 4, 2026
60e2e0f
test: fix link E2E test isolation with unique names
patchmemory Feb 4, 2026
f4b9aa1
test: improve E2E test wait conditions for API calls
patchmemory Feb 4, 2026
5b3b7f8
test: fix E2E hanging with better wait conditions
patchmemory Feb 4, 2026
da07de5
test: simplify delete assertions with item iteration
patchmemory Feb 4, 2026
76ac25d
test: wait for wizard to hide after delete
patchmemory Feb 4, 2026
b70df22
fix: ensure link definitions always have valid UUIDs
patchmemory Feb 4, 2026
51ebf9b
fix: reset wizard to step 1 when loading a link definition
patchmemory Feb 4, 2026
3e0b559
feat: add automatic test scan cleanup after E2E tests
patchmemory Feb 4, 2026
4f208eb
fix: use static import for teardown function
patchmemory Feb 4, 2026
c2a28b7
feat(docs): add quickstart guide and RO-Crate demo script
patchmemory Feb 4, 2026
5a1957c
chore(dev): update submodule pointer for completed task
patchmemory Feb 4, 2026
6919d84
test(e2e): add navigation test and Neo4j conditional tests
patchmemory Feb 4, 2026
cb50f66
chore(dev): update submodule pointer for completed task
patchmemory Feb 4, 2026
220 changes: 220 additions & 0 deletions QUICKSTART.md
@@ -0,0 +1,220 @@
# SciDK Quickstart: Fresh Install to First RO-Crate

**Goal**: Get from zero to your first RO-Crate in under 30 minutes.

**Prerequisites**: Python 3.10+, git, and about 30 minutes.

---

## 1. Install (5 minutes)

```bash
# Clone the repository
git clone https://github.com/yourusername/scidk.git
cd scidk

# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate # bash/zsh
# or: source .venv/bin/activate.fish # fish shell

# Install SciDK in editable mode
pip install -e .

# Initialize environment (optional but recommended)
source scripts/init_env.sh
```

**Verify installation**:
```bash
scidk-serve --help
# Should show: usage: scidk-serve ...
```

---

## 2. Start the Server (1 minute)

```bash
# Start SciDK
scidk-serve
# or: python3 -m scidk.app
```

Server starts at: **http://127.0.0.1:5000**

Open in your browser and you should see the SciDK home page.

---

## 3. Scan Your First Directory (3 minutes)

### Via UI:
1. Navigate to the **Files** page (http://127.0.0.1:5000/datasets)
2. Select provider: **Local Filesystem**
3. Enter a path (e.g., `/home/user/Documents` or use the repository root)
4. Check "Recursive" if you want subdirectories
5. Click **Scan Files**
6. Wait for scan to complete (progress shown in Background Tasks)

### Via API (alternative):
```bash
curl -X POST http://127.0.0.1:5000/api/scan \
  -H "Content-Type: application/json" \
  -d '{"path": "/path/to/your/data", "recursive": true}'
```
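
The same scan request can also be issued from Python. This is a minimal sketch that assumes only what the curl command shows: the `/api/scan` path and the `path` and `recursive` JSON fields. The helper names `build_scan_request` and `start_scan` are hypothetical, and because the response format is not documented here, the body is returned unparsed.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # default address from step 2


def build_scan_request(path, recursive=True, base_url=BASE_URL):
    """Build (but do not send) the POST request shown in the curl example."""
    payload = json.dumps({"path": path, "recursive": recursive}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/scan",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_scan(path, recursive=True):
    """Send the request; needs a running SciDK server."""
    with urllib.request.urlopen(build_scan_request(path, recursive)) as resp:
        return resp.read().decode("utf-8")  # response schema is not assumed
```

Keeping request construction separate from sending makes it easy to inspect the payload before anything hits the server.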

---

## 4. Browse Scanned Files (2 minutes)

After scanning completes:

1. **Files page** shows all discovered datasets
2. Click any dataset to see details:
   - File metadata (size, type, timestamps)
   - Interpreted content (for Python, CSV, JSON, YAML, IPYNB, XLSX)
   - Import dependencies (for code files)

**API alternative**:
```bash
# List all scanned datasets
curl http://127.0.0.1:5000/api/datasets

# Get specific dataset details
curl http://127.0.0.1:5000/api/datasets/<dataset-id>
```

---

## 5. Select Files for RO-Crate (5 minutes)

Selection is currently manual: browse the Files page and note the datasets you want to include. To narrow the list programmatically, use the search API:

```bash
# Use search to find specific file types
curl "http://127.0.0.1:5000/api/search?q=csv"

# Filter by interpreter
curl "http://127.0.0.1:5000/api/search?q=python_code"
```
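
When search terms contain spaces or special characters, it helps to build the URL programmatically. The sketch below assumes only the `/api/search?q=...` form shown above; `search_url` is a hypothetical helper, not part of SciDK.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:5000"  # default address from step 2


def search_url(query, base_url=BASE_URL):
    """Return the /api/search URL for a query, percent-encoding it safely."""
    return f"{base_url}/api/search?{urlencode({'q': query})}"


print(search_url("csv"))
print(search_url("python_code"))
```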

Note which datasets you want to include; RO-Crate packaging is next.

---

## 6. Create RO-Crate (5 minutes)

### Quick RO-Crate Generation:

For a scanned directory, generate a minimal RO-Crate:

```bash
# Generate RO-Crate JSON-LD for a directory
curl "http://127.0.0.1:5000/api/rocrate?path=/path/to/scanned/dir" > ro-crate-metadata.json
```

The RO-Crate will include:
- Root Dataset entity
- File/Folder entities (depth=1 by default)
- Contextual metadata per RO-Crate spec
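
To make that structure concrete, here is a rough sketch of what a minimal depth-1 crate looks like when built by hand. It follows the RO-Crate 1.1 layout (metadata descriptor plus root Dataset) but is not SciDK's implementation: `minimal_rocrate` is a hypothetical helper, the output of `/api/rocrate` may differ, and recommended root properties such as `description`, `datePublished`, and `license` are omitted. File names are used as `@id` values without URI-encoding, which is a simplification.

```python
from pathlib import Path


def minimal_rocrate(directory):
    """Describe a directory's immediate children (depth 1) as RO-Crate 1.1 JSON-LD."""
    root = Path(directory)
    parts = []
    for child in sorted(root.iterdir()):
        if child.name == "ro-crate-metadata.json":
            continue  # the metadata file itself is not a data entity
        if child.is_dir():
            parts.append({"@id": child.name + "/", "@type": "Dataset"})
        else:
            parts.append({"@id": child.name, "@type": "File"})
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {  # metadata descriptor: points at the root dataset
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {  # root dataset: lists its direct children
                "@id": "./",
                "@type": "Dataset",
                "name": root.resolve().name,
                "hasPart": [{"@id": p["@id"]} for p in parts],
            },
            *parts,
        ],
    }
```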

### Via UI (if viewer embedding is enabled):
1. Set the environment variable: `export SCIDK_FILES_VIEWER=rocrate`
2. Restart the server
3. The Files page will show an **"Open in RO-Crate Viewer"** button
4. Click it to view the embedded crate metadata

---

## 7. Export RO-Crate as ZIP (5 minutes)

Create a complete RO-Crate package with data files:

```bash
# Using demo script (recommended)
./scripts/demo_rocrate_export.sh /path/to/scanned/dir ./my-crate.zip

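# Manual steps:
# 1. Copy data files into a crate directory (the trailing /. also copies dotfiles)
# 2. Generate ro-crate-metadata.json (step 6) at the crate root, after copying so it is not overwritten
# 3. Zip the directory contents so the metadata file sits at the archive root
mkdir -p my-crate
cp -r /path/to/dir/. my-crate/
curl "http://127.0.0.1:5000/api/rocrate?path=/path/to/dir" > my-crate/ro-crate-metadata.json
(cd my-crate && zip -r ../my-crate.zip .)
```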
```
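
If you would rather script the packaging in Python, the following sketch zips a crate directory with only the standard library. It is a variant of the manual steps, not what `demo_rocrate_export.sh` does: `zip_crate` is a hypothetical helper, and it stores entries relative to the crate directory so that `ro-crate-metadata.json` ends up at the archive root, where RO-Crate tooling generally looks for it.

```python
import zipfile
from pathlib import Path


def zip_crate(crate_dir, zip_path):
    """Zip crate_dir so its contents sit at the archive root."""
    crate_dir = Path(crate_dir)
    if not (crate_dir / "ro-crate-metadata.json").is_file():
        raise FileNotFoundError("ro-crate-metadata.json missing from crate directory")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(crate_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the crate root, with forward slashes
                zf.write(path, path.relative_to(crate_dir).as_posix())
    return zip_path
```

Write the archive outside the crate directory (for example `zip_crate("my-crate", "my-crate.zip")`), otherwise the ZIP would try to include itself.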

**Result**: `my-crate.zip` is a valid RO-Crate package containing:
- `ro-crate-metadata.json` (JSON-LD metadata)
- Data files from your scanned directory

---

## Verify Your RO-Crate (2 minutes)

```bash
# List the archive contents and inspect the Dataset entities
unzip -l my-crate.zip
jq '.["@graph"][] | select(.["@type"] == "Dataset")' my-crate/ro-crate-metadata.json

# Load with ro-crate-py as a sanity check (optional)
pip install rocrate
python3 -c "from rocrate.rocrate import ROCrate; c = ROCrate('my-crate'); print(c.root_dataset)"
```
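
For a dependency-free check, the root dataset can also be located by hand. This sketch follows the RO-Crate 1.1 convention of finding the `ro-crate-metadata.json` descriptor entity and following its `about` link; `find_root_dataset` and `load_root_dataset` are hypothetical helpers, and this is a structural sanity check rather than full validation.

```python
import json


def find_root_dataset(metadata):
    """Return the root Dataset entity by following the descriptor's `about` link."""
    entities = {e["@id"]: e for e in metadata.get("@graph", [])}
    descriptor = entities.get("ro-crate-metadata.json")
    if descriptor is None:
        raise ValueError("no metadata descriptor entity found")
    root = entities.get(descriptor["about"]["@id"])
    if root is None:
        raise ValueError("descriptor points at a missing root dataset")
    return root


def load_root_dataset(path="my-crate/ro-crate-metadata.json"):
    """Read a metadata file from disk and return its root dataset."""
    with open(path, encoding="utf-8") as fh:
        return find_root_dataset(json.load(fh))
```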

---

## Troubleshooting

### Port already in use
```bash
# Check what's using port 5000
lsof -i :5000

# Change port
export SCIDK_PORT=5001
scidk-serve
```
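
If `lsof` is not available, a quick Python check can tell you whether something is already listening on the port before you pick a value for `SCIDK_PORT`. This is a generic socket probe, not a SciDK feature; `port_in_use` is a hypothetical helper.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for candidate in (5000, 5001):
        print(candidate, "in use" if port_in_use(candidate) else "free")
```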

### Scan not finding files
- Verify the path exists and is readable
- Check recursive flag if scanning subdirectories
- Install `ncdu` for faster scanning: `brew install ncdu` (macOS) or `sudo apt install ncdu` (Debian/Ubuntu)

### RO-Crate endpoint returns 404
- Ensure you're running the latest code from the main branch
- Check that the `/api/rocrate` endpoint is implemented (planned for v0.1.0)
- See `dev/features/ui/feature-rocrate-viewer-embedding.md` for implementation status

---

## Next Steps

**Explore more features**:
- **Map page** (http://127.0.0.1:5000/map): Visualize knowledge graph schema
- **Labels & Links**: Annotate files with custom labels and relationships
- **Providers**: Connect remote sources via rclone (S3, Google Drive, etc.)
- **Neo4j**: Enable persistent graph storage (see README § Neo4j integration)

**Documentation**:
- Full README: `/README.md`
- Development workflow: `dev/README-planning.md`
- RO-Crate feature spec: `dev/features/ui/feature-rocrate-viewer-embedding.md`

**Community**:
- Report issues: https://github.com/yourusername/scidk/issues
- Contributing: `CONTRIBUTING.md`

---

**Total time**: ~25 minutes from clone to packaged RO-Crate

**You're ready!** You've now:
- ✅ Installed SciDK
- ✅ Scanned a directory
- ✅ Browsed files and metadata
- ✅ Generated RO-Crate JSON-LD
- ✅ Exported a complete RO-Crate ZIP package

Happy crate-ing! 🎉
2 changes: 1 addition & 1 deletion dev
Submodule dev updated from 2dac9d to 37723a
38 changes: 38 additions & 0 deletions e2e/core-flows.spec.ts
@@ -122,3 +122,41 @@ test('browse page shows correct file listing structure', async ({ page, baseURL,
  // Cleanup
  fs.rmSync(tempDir, { recursive: true, force: true });
});

test('navigation covers all 7 pages', async ({ page, baseURL }) => {
  const base = baseURL || process.env.BASE_URL || 'http://127.0.0.1:5000';

  // Start at home
  await page.goto(base);
  await page.waitForLoadState('networkidle');

  // Define all pages with their nav test IDs, URLs, and expected titles
  const pages = [
    { testId: 'nav-files', url: '/datasets', titlePattern: /Files|Datasets/i },
    { testId: 'nav-maps', url: '/map', titlePattern: /Map/i },
    { testId: 'nav-chats', url: '/chat', titlePattern: /Chat/i },
    { testId: 'nav-labels', url: '/labels', titlePattern: /Labels/i },
    { testId: 'nav-links', url: '/links', titlePattern: /Links/i },
    { testId: 'nav-settings', url: '/settings', titlePattern: /Settings/i },
  ];

  for (const { testId, url, titlePattern } of pages) {
    // Verify nav link is visible
    const navLink = page.getByTestId(testId);
    await expect(navLink).toBeVisible();

    // Navigate
    await navLink.click();
    await page.waitForLoadState('networkidle');

    // Verify page loads correctly
    await expect(page).toHaveURL(new RegExp(url));
    await expect(page).toHaveTitle(titlePattern);
  }

  // Test home navigation via logo
  await page.getByTestId('nav-home').click();
  await page.waitForLoadState('networkidle');
  await expect(page).toHaveURL(base);
  await expect(page).toHaveTitle(/SciDK/i);
});
23 changes: 23 additions & 0 deletions e2e/global-teardown.ts
@@ -0,0 +1,23 @@
import { FullConfig } from '@playwright/test';

// Import the teardown function from global-setup
import { teardown } from './global-setup';

export default async function globalTeardown(_config: FullConfig) {
  // Clean up test scans before shutting down the server
  const baseUrl = process.env.BASE_URL;
  if (baseUrl) {
    try {
      const response = await fetch(`${baseUrl}/api/admin/cleanup-test-scans`, {
        method: 'POST',
      });
      if (!response.ok) {
        throw new Error(`cleanup endpoint returned HTTP ${response.status}`);
      }
      const result = await response.json();
      console.log('[cleanup] Test scans cleaned up:', result);
    } catch (error) {
      // Cleanup is best-effort: log and continue so the server is still stopped
      console.error('[cleanup] Failed to cleanup test scans:', error);
    }
  }

  // Kill the server process
  await teardown();
}
80 changes: 80 additions & 0 deletions e2e/labels.spec.ts
@@ -231,3 +231,83 @@ test('validation: cannot save label without name', async ({ page, baseURL }) =>
  const value = await labelNameInput.inputValue();
  expect(value).toBe('');
});

test('neo4j: push label to neo4j', async ({ page, baseURL }) => {
  // Skip test if Neo4j is not configured
  test.skip(!process.env.NEO4J_URI, 'NEO4J_URI not configured');

  const base = baseURL || process.env.BASE_URL || 'http://127.0.0.1:5000';
  await page.goto(`${base}/labels`);
  await page.waitForLoadState('networkidle');

  // Create a label first
  await page.getByTestId('new-label-btn').click();
  await page.getByTestId('label-name').fill('Neo4jTestLabel');

  // Add a property
  await page.getByTestId('add-property-btn').click();
  const propertyRow = page.getByTestId('property-row').first();
  await propertyRow.getByTestId('property-name').fill('id');
  await propertyRow.getByTestId('property-type').selectOption('string');
  await propertyRow.getByTestId('property-required').check();

  // Save the label
  await page.getByTestId('save-label-btn').click();
  await page.waitForTimeout(1000);

  // Verify Push to Neo4j button is visible
  const pushBtn = page.getByTestId('push-neo4j-btn');
  await expect(pushBtn).toBeVisible();

  // Push to Neo4j
  await pushBtn.click();
  await page.waitForTimeout(2000);

  // Wait for success toast (the push should succeed if Neo4j is connected)
  // We can't easily check the toast content, but we can verify no errors occurred
  // by checking that the page is still functional

  // Verify label is still loadable
  const labelItems = page.getByTestId('label-item');
  await expect(labelItems.first()).toBeVisible();

  // Cleanup: delete the label created by this test rather than whichever item is first
  // (assumes each label-item renders the label name)
  page.on('dialog', async (dialog) => await dialog.accept());
  await labelItems.filter({ hasText: 'Neo4jTestLabel' }).first().click();
  await page.waitForTimeout(300);
  await page.getByTestId('delete-label-btn').click();
  await page.waitForTimeout(500);
});

test('neo4j: pull labels from neo4j', async ({ page, baseURL }) => {
  // Skip test if Neo4j is not configured
  test.skip(!process.env.NEO4J_URI, 'NEO4J_URI not configured');

  const base = baseURL || process.env.BASE_URL || 'http://127.0.0.1:5000';
  await page.goto(`${base}/labels`);
  await page.waitForLoadState('networkidle');

  // Click the "New Label" button to show the editor
  await page.getByTestId('new-label-btn').click();

  // Verify Pull from Neo4j button is visible
  const pullBtn = page.getByTestId('pull-neo4j-btn');
  await expect(pullBtn).toBeVisible();

  // Set up dialog handler before clicking
  page.on('dialog', async (dialog) => {
    expect(dialog.type()).toBe('confirm');
    expect(dialog.message()).toContain('Pull schema from Neo4j');
    await dialog.accept();
  });

  // Click Pull from Neo4j
  await pullBtn.click();
  await page.waitForTimeout(2000);

  // After pulling, labels should be loaded (if any exist in Neo4j)
  // We can't guarantee any labels exist, but the operation should complete without error
  // Verify the label list is still visible and functional
  const labelList = page.getByTestId('label-list');
  await expect(labelList).toBeVisible();
});