2024-07 |
SHREG: Mitigating register redundancy in GPUs |
Journal of Systems Architecture
|
2024-03 |
REPrune: Channel Pruning via Kernel Representative Selection |
Proceedings of the AAAI Conference on Artificial Intelligence
|
2023-12 |
A convertible neural processor supporting adaptive quantization for real-time neural networks |
Journal of Systems Architecture
|
2023-10 |
INTERPRET: Inter-Warp Register Reuse for GPU Tensor Core |
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
|
2023-08 |
TensorCV: Accelerating Inference-Adjacent Computation Using Tensor Processors |
Proceedings of the International Symposium on Low Power Electronics and Design
|
2023-07 |
Lightning Talk: Efficiency and Programmability of DNN Accelerators and GPUs |
Proceedings - Design Automation Conference
|
2023-07 |
Quixote: Improving Fidelity of Quantum Program by Independent Execution of Controlled Gates |
Proceedings - Design Automation Conference
|
2023-06 |
Balanced Column-Wise Block Pruning for Maximizing GPU Parallelism |
Proceedings of the AAAI Conference on Artificial Intelligence
|
2023-06 |
R2D2: Removing ReDunDancy Utilizing Linearity of Address Generation in GPUs |
Proceedings - International Symposium on Computer Architecture
|
2023-02 |
SnakeByte: A TLB Design with Adaptive and Recursive Page Merging in GPUs |
IEEE High-Performance Computer Architecture Symposium Proceedings
|
2022-12 |
CASH-RF: A Compiler-Assisted Hierarchical Register File in GPUs |
IEEE Embedded Systems Letters
|
2022-10 |
Reconstructing Out-of-Order Issue Queue |
Proceedings of the Annual International Symposium on Microarchitecture, MICRO
|
2021-07 |
PIMCaffe: Functional Evaluation of a Machine Learning Framework for In-Memory Neural Processing Unit |
IEEE Access
|
2021-06 |
SPACE: Locality-Aware Processing in Heterogeneous Memory for Personalized Recommendations |
Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
|
2021-06 |
Two-Stage In-Storage Processing and Scheduling for Pattern Matching Applications |
IEEE Access
|
2020-11 |
Duplo: Lifting redundant memory accesses of deep neural networks for GPU tensor cores |
Proceedings of the Annual International Symposium on Microarchitecture, MICRO
|
2020-07 |
Check-In: In-Storage Checkpointing for Key-Value Store System Leveraging Flash-Based SSDs |
Proceedings - International Symposium on Computer Architecture
|
2020-07 |
Hi-End: Hierarchical, Endurance-Aware STT-MRAM-Based Register File for Energy-Efficient GPUs |
IEEE Access
|
2020-05 |
REACT: Scalable and High-Performance Regular Expression Pattern Matching Accelerator for In-Storage Processing |
IEEE Transactions on Parallel and Distributed Systems
|
2020-02 |
CASINO Core Microarchitecture: Generating Out-of-Order Schedules Using Cascaded In-Order Scheduling Windows |
IEEE High-Performance Computer Architecture Symposium Proceedings
|
2019-12 |
OverCome: Coarse-Grained Instruction Commit with Handover Register Renaming |
IEEE Transactions on Computers
|
2019-10 |
Architecture Design of a SIMT-Based MapReduce Accelerator Embedded in a Solid-State Drive |
전자공학회논문지
|
2019-09 |
Design of a Lightweight Regular Expression Matching Accelerator Architecture Embedded in an SSD |
전자공학회논문지
|
2019-06 |
Linebacker: Preserving Victim Cache Lines in Idle Register Files of GPUs |
Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
|
2019-05 |
Contents-aware partitioning algorithm for parallel high efficiency video coding |
Multimedia Tools and Applications
|
2019-05 |
Fast CU Depth Decision for HEVC Using Neural Networks |
IEEE Transactions on Circuits and Systems for Video Technology
|
2019-04 |
Adaptive Cooperation of Prefetching and Warp Scheduling on GPUs |
IEEE Transactions on Computers
|
2018-10 |
FineReg: Fine-Grained Register File Management for Augmenting GPU Throughput |
Proceedings of the Annual International Symposium on Microarchitecture, MICRO
|
2018-09 |
WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs |
IEEE Transactions on Computers
|
2018-09 |
Exploiting Pseudo-Quadtree Structure for Accelerating HEVC Spatial Resolution Downscaling Transcoder |
IEEE Transactions on Multimedia
|
2018-04 |
Architectural Protection of Application Privacy against Software and Physical Attacks in Untrusted Cloud Environment |
IEEE Transactions on Cloud Computing
|
2018-04 |
Simultaneous and Speculative Thread Migration for Improving Energy Efficiency of Heterogeneous Core Architectures |
IEEE Transactions on Computers
|
2018-02 |
WIR: Warp Instruction Reuse to Minimize Repeated Computations in GPUs |
IEEE High-Performance Computer Architecture Symposium Proceedings
|
2017-11 |
Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs |
IEEE Transactions on Parallel and Distributed Systems
|
2017-06 |
Access Pattern-Aware Cache Management for Improving Data Utilization in GPU |
Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
|
2017-06 |
Dynamic Load Balancing of Dispatch Scheduling for Solid State Disks |
IEEE Transactions on Computers
|
2017-05 |
Improving Energy Efficiency of GPUs through Data Compression and Compressed Execution |
IEEE Transactions on Computers
|
2016-09 |
Recent Research Trends in Hardware Accelerators for Artificial Neural Network Computation |
정보과학회지
|
2016-06 |
Virtual Thread: Maximizing Thread-Level Parallelism beyond GPU Scheduling Limit |
Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
|
2016-06 |
APRES: Improving Cache Efficiency by Exploiting Load Characteristics on GPUs |
Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
|
2016-05 |
Server side, play buffer based quality control for adaptive media streaming |
Multimedia Tools and Applications
|
2016-04 |
Exploiting Thread-Level Parallelism on HEVC by Employing a Reference Dependency Graph |
IEEE Transactions on Circuits and Systems for Video Technology
|
2016-04 |
Parallel GPU Architecture Simulation Framework Exploiting Architectural-Level Parallelism with Timing Error Prediction |
IEEE Transactions on Computers
|
2016-03 |
Warped-preexecution: A GPU pre-execution approach for improving latency hiding |
IEEE High-Performance Computer Architecture Symposium Proceedings
|
2015-12 |
A Performance-Energy Model to Evaluate Single Thread Execution Acceleration |
IEEE Computer Architecture Letters
|
2015-12 |
Dynamic Load Balancing of Parallel SURF with Vertical Partitioning |
IEEE Transactions on Parallel and Distributed Systems
|
2015-10 |
Network Variation and Fault Tolerant Performance Acceleration in Mobile Devices with Simultaneous Remote Execution |
IEEE Transactions on Computers
|
2015-06 |
Another Look at Secure Big Data Processing: Formal Framework and a Potential Approach |
IEEE International Conference on Cloud Computing, CLOUD
|
2015-06 |
Enhancing Software Dependability and Security with Hardware Supported Instruction Address Space Randomization |
Proceedings: International Conference on Dependable Systems and Networks
|
2015-06 |
Warped-Compression: Enabling Power Efficient GPUs through Register Compression |
Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
|
2015-04 |
Highly Secure Mobile Devices Assisted with Trusted Cloud Computing Environments |
ETRI Journal
|
2014-12 |
A malicious pattern detection engine for embedded security systems in the internet of things |
Sensors
|
2014-09 |
A Study on Efficient Physical Resource Utilization of Software Load Balancers on Virtual Machines Using Multiple Processes |
전자공학회논문지
|
2014-08 |
C-Lock: Energy Efficient Synchronization for Embedded Multicore Systems |
IEEE Transactions on Computers
|
2014-07 |
A Dual Transcoding Technique for Maintaining QoS of Video Streaming Services |
정보처리학회논문지. 컴퓨터 및 통신시스템
|
2014-07 |
Complexity-Effective Contention Management with Dynamic Backoff for Transactional Memory Systems |
IEEE Transactions on Computers
|
2014-07 |
Swarm processor system: Hardware process scheduler based energy efficient multi-core system |
IEICE Electronics Express
|
2014-07 |
Architectural investigation of matrix data layout on multicore processors |
Future Generation Computer Systems - The International Journal of Grid Computing and eScience
|
2014-06 |
A Suffix Tree-Based High-Speed Pattern Matching Algorithm for Network Security |
전자공학회논문지
|
2014-06 |
Exploiting Implementation Diversity and Partial Connection of Routers in Application-Specific Network-on-Chip Topology Synthesis |
IEEE Transactions on Computers
|
2014-06 |
Accelerating MapReduce framework on multi-GPU systems |
Cluster Computing - The Journal of Networks, Software Tools and Applications
|
2014-05 |
Analysis of Performance Degradation Factors in Many-Core GPU Architectures and Recent Research Trends |
정보과학회지
|
2014-04 |
Boosting CUDA Applications with CPU-GPU Hybrid Computing |
International Journal of Parallel Programming
|
2013-10 |
Parallelized sub-resource loading for web rendering engine |
Journal of Systems Architecture
|
2013-08 |
GPU-Friendly Parallel Genome Matching with Tiled Access and Reduced State Transition Table |
International Journal of Parallel Programming
|
2013-08 |
Design and Evaluation of Random Linear Network Coding Accelerators on FPGAs |
ACM Transactions on Embedded Computing Systems
|
2013-04 |
A distributed signature detection method for detecting intrusions in sensor systems |
Sensors
|
2013-01 |
Exploiting SIMD parallelism on dynamically partitioned parallel network coding for P2P systems |
Computers &amp; Electrical Engineering
|
2013-01 |
Benefits of using parallelized non-progressive network coding |
Journal of Network and Computer Applications
|
2013-01 |
Importance of Coherence Protocols with Network Applications on Multi-Core Processors |
IEEE Transactions on Computers
|
2012-11 |
Multi-Threading and Suffix Grouping on Massive Multiple Pattern Matching Algorithm |
Computer Journal
|
2012-05 |
Offloading of media transcoding for high-quality multimedia services |
IEEE Transactions on Consumer Electronics
|
2012-03 |
Design of power-efficient parallel pipelined Bloom filter |
Electronics Letters
|
2012-03 |
An Efficient Block Cipher Implementation on Many-Core Graphics Processing Units |
JIPS (Journal of Information Processing Systems)
|
2012-02 |
Reconfigurable and Parallelized Network Coding Decoder for VANETs |
Mobile Information Systems
|
2012-01 |
Accelerated Network Coding with Dynamic Stream Decomposition on Graphics Processing Unit |
Computer Journal
|
2011-12 |
A Novel Sequential Tree Algorithm Based on Scoreboard for MPI Broadcast Communication |
IEICE Transactions on Information and Systems
|
2011-08 |
Network Coding on Heterogeneous Multi-Core Processors for Wireless Sensor Networks |
Sensors
|
2011-07 |
An Efficient Cache Architecture for Transactional Memory |
전자공학회논문지 - CI
|
2011-07 |
A Low-Cost Standard Mode MPI Hardware Unit for Embedded MPSoC |
IEICE Transactions on Information and Systems
|
2011-02 |
Improving the Performance of Conflict Resolution Policies in Transactional Memory through Multiple Signature Comparison |
정보처리학회논문지A,B,C,D
|
2011-01 |
An Efficient Multi-Core Transactional Memory for Parallel Processing with Concentrated Conflicts |
전자공학회논문지 - CI
|
2010-12 |
An Efficient Parallelization Technique for the SEED Block Cipher Algorithm Using the CELL Processor |
정보처리학회논문지A,B,C,D
|
2010-11 |
On improving parallelized network coding with dynamic partitioning |
IEEE Transactions on Parallel and Distributed Systems
|
2010-10 |
Multithreaded pattern matching algorithm with data rearrangement |
IEICE Electronics Express
|
2010-03 |
Hardware implementation of a tessellation accelerator for the OpenVG standard |
IEICE Electronics Express
|
2010-02 |
An Embedded System for Automotive Intelligent Power Switches Supporting CAN (Controller Area Network) Communication |
정보처리학회논문지C
|
2009-05 |
A complexity-effective microprocessor design with decoupled dispatch queues and prefetching |
Parallel Computing
|
2008-12 |
Performance Evaluation of Programming Models for SMP-Based Clusters |
Journal of the Chinese Institute of Engineers
|
2008-12 |
Efficient Peer-to-Peer File Sharing Using Network Coding in MANET |
Journal of Communications and Networks
|
2008-11 |
A low-complexity microprocessor design with speculative pre-execution |
Journal of Systems Architecture
|
2008-10 |
Simultaneous thin-thread processors for low-power embedded systems |
IEICE Electronics Express
|
2008-10 |
Delay Analysis of Car-to-Car Reliable Data Delivery Strategies Based on Data Mulling with Network Coding |
IEICE Transactions on Information and Systems
|